ABSTRACT

We use hydrodynamic cosmological simulations to explore the evolution of
the intergalactic medium (IGM) transmissivity from z=2 through the epoch
of reionization. We simulate a concordance ΛCDM model in a 9.6 Mpc box
with a comoving spatial resolution of 37.5 kpc. Reionization is treated in
the optically thin approximation using an ultraviolet background (UVB) that
includes evolving stellar and QSO source populations. In this approximation,
ionization bubble overlap is treated by ramping up the UVB over a finite
redshift interval Δz ~ 0.5, consistent with the inhomogeneous reionization
simulations of Razoumov et al. (2002) for several reionization redshifts. We
construct noiseless synthetic HI Lyα absorption spectra by casting lines of
sight (LOS) through our continuously evolving box and analyze their properties
using standard techniques. Different spectral resolutions are also studied by
convolving full resolution data down to R=36,000 and R=5,300 depending on
redshift. Parametric fits to the data are provided based on either analytic
approximations or straightforward regressions.

We find a smooth evolution of the effective optical depth that follows a power
law with a slope of 4.16 ± 0.02 up to the epoch of reionization. The smooth profile
can also be fitted with the Songaila and Cowie (2002) parametrization F(g,z).
The normalized ionization rate g recovered from our spectra in this way
agrees, within the error of the fit, with the input ionization rate we used in the
simulation. As we cross into the epoch of reionization, the mean transmitted
flux (MTF) and the variance of the transmitted flux sharply deviate from
a smooth evolution. However, the simultaneous sharp increase of the variance
and sharp decrease of the mean transmitted flux introduce large margins of
error, which place a high degree of uncertainty on the computed optical depth
evolution profile. A distinction between high and low transmission lines of sight
shows that the two subsamples skew the results in two different directions,
which may either imply a continuation of a smooth profile or underestimate the
global transmissivity properties during reionization. However, despite the
statistical uncertainty in inferring the reionization profile from spectra, the end
of the opacity phase transition of the IGM correlates well with the redshift at
which both the mean and the variance of the transmitted flux rapidly deviate from
a smooth profile. Furthermore, we quantify the relation between the line of sight and
the cosmic flux variance, which is computed from the statistical average of the
flux variance along all lines of sight, and conclude that because the latter is a
lower bound estimate due to limitations imposed by our box size, it is a more
sensitive tool than the MTF in mapping reionization. It nonetheless causes the
distribution of mean fluxes along lines of sight to have an increasing flatness as
the redshift increases. We estimate that in our cosmic realization of reionization
and regardless of spectral resolution, an unobtainable number of lines of sight is
needed to yield a normal distribution of LOS-mean fluxes that would allow an
estimate of the MTF with less than 10% relative margin of error.

In addition to optical transmission, we compare the predicted dark gap 
length distribution with observations. We show that this statistic is sensitive 
to spectral resolution at reionization redshifts, but overall in agreement with 
results by Songaila and Cowie (2002). Finally, we derive a positive correlation 
between the mean optical depth within a gap and the size of the gap, which 
relates "transmission statistics" to "dark gap statistics" in high redshift studies 
of the IGM. 

Subject headings: early universe — intergalactic medium — quasars: absorption 
lines — galaxies: formation 






1. Introduction 

The detection of quasars at z > 6 (Becker et al. 2002; Fan et al. 2002, 2003; Hu et
al. 2002) suggests that the intergalactic medium is highly ionized by z = 6 and therefore
reionization began at redshifts z > 6 due to UV emitting sources other than quasars (QSOs).
Quasars are not considered to be such sources because their comoving number density
decreases rapidly at z > 4 (Shapiro & Giroux 1987; Madau, Haardt & Rees 1999). By
virtue of the WMAP results (Bennett et al. 2003) an era of reionization is associated with
the epoch of early star formation at z ~ 17. The massive PopIII stars forming
then produce sufficient UV photons to at least partially ionize the IGM (Barkana & Loeb
2001, 2003; Haiman & Holder 2003). In addition, the discovery of a large number of Lyman
Break Galaxies (LBG) (Steidel et al. 2003) at redshifts (z > 3) could also explain the
completion of reionization by z ~ 6 due to the population of proto-galaxies that form earlier
(z < 9). In that case, massive galactic type stars, which are abundant at z > 6 (Madau et
al. 1999), would be the ionizing sources of a galaxy-dominated UV background (Haehnelt et
al. 2001). Simulations can be "fine-tuned" to describe one or the other scenario (Razoumov
et al. 2001; Ciardi et al. 2003; Sokasian et al. 2003). Regardless of whether such results
will become the standard accepted theory in the study of reionization, one conclusion has
gained substantial confidence: hydrogen reionization was most likely caused by a soft,
stellar type UV radiation field with a large softness index ($S = \Gamma_{\rm HI}/\Gamma_{\rm HeII}$)
between redshifts z = 6 and z = 15.

The large optical depth due to electron scattering detected by WMAP places
the beginning of reionization much earlier than the highest redshift Lyα emitter (SDSS
1148+5251) detected to date in ground based observations at z = 6.56 (Hu et al. 2002).
If the Universe was permanently and globally reionized after z = 15 then the Lyα opacity
evolution is expected to be smooth down to small redshifts (Songaila 2004). However,
Lyman-α forest observations in the spectrum of SDSS 1030+0524 (z=6.28) show an optical
depth trough at z = 6.05 (Becker et al. 2002; Songaila & Cowie 2002; Fan et al. 2002).
The lack of transmitted flux has been interpreted as the detection of the reionization
tail. However, Songaila (2004) has shown that the previous conclusion was based on an
incorrect conversion of Lyβ opacities to Lyα opacities. The correct conversions show a
smooth evolution of the optical depth up to z = 6.3. This suggests that the trough in the
SDSS 1030+0524 spectrum might be a 'dark gap' occurrence due to the large variance in
absorption at high redshifts, although it is not clear whether the line of sight variance (Fan
et al. 2002) or the cosmic variance (Songaila 2004) is responsible.

In this paper we investigate the evolution of the mean transmitted flux and flux variance
using hydrodynamic cosmological simulations of the Lyα forest in a concordance ΛCDM
universe. We adopt a picture of hydrogen reionization that begins at z ≈ 7 due to
the galactic radiative feedback and is rapidly completed by z ≈ 6.5. That does not
exclude the possibility of recurring reionization events prior to z ~ 7 but does exclude a
single reionization instance at z ≈ 15. In our picture, the transmitted Lyα flux evolves
smoothly from small redshifts up to z ~ 6.5, which is consistent with the Songaila (2004)
result. Knowing the exact moment and shape of reionization in our simulation allows the
investigation of the statistical properties of the Lyα transmission using synthetic spectra
even within the reionization tail. We find that, although the mean transmitted flux deviates
from the extrapolated smooth evolution as it enters the reionization tail, the large scatter
in the transmitted flux from one line of sight to another makes it a poor indicator of
reionization. An observation could miss the reionization tail or underestimate the redshift
at which the tail begins. Instead we propose that the line of sight variance be used for
tracing the reionization tail because it shows a larger sensitivity to the hydrogen opacity.

2. Simulations

We have performed a ΛCDM hydro-cosmological simulation using our Eulerian code
Enzo (Bryan & Norman 1997; Norman & Bryan 1998; Bryan et al. 1999) with $\Omega_\Lambda = 0.73$,
$h = 0.71$ ($H_0 = 100h$ km s$^{-1}$ Mpc$^{-1}$), $\Omega_m = 0.27$ and $\Omega_b = 0.04$. The primordial
distribution of gas and dark matter was initialized in a comoving box of $6.816\,h^{-1}$ Mpc using
the linear power spectrum from Eisenstein & Hu (1999) with $\sigma_8 = 0.94$. The cosmic, hydrodynamic
and ionization evolution of the IGM was then computed from z=99 to z=2.0. Enzo solves
the coupled system of the multispecies, fluid, self-gravity and ionization equations on an
Eulerian comoving grid for the redshift evolution of the 3D distribution of the gas variables
(e.g. temperature, density etc.). The spatial distribution of $256^3$ collisionless cold dark
matter particles is also computed at every time-step and is used to calculate the large-scale
gravitational field in the box. The grid resolution of $256^3$ cells gives our simulation box a
comoving spatial resolution of $26.62\,h^{-1}$ kpc. Bryan et al. (1999) concluded that in order to
resolve the Lyα forest in a numerical simulation one needs at least 40 kpc spatial resolution.
For our choice of the Hubble constant and box size, we resolve comoving scales of 37 kpc 
per grid cell. 
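
The box size and cell size quoted above can be checked with a few lines of arithmetic. The sketch below only converts the comoving box size from h^-1 Mpc to Mpc and divides by the number of cells per side; the function name is illustrative and is not part of Enzo.

```python
def comoving_cell_size_kpc(box_h_inv_mpc, h, cells_per_side):
    """Return (box size in Mpc, cell size in kpc), both comoving."""
    box_mpc = box_h_inv_mpc / h                    # remove the h^-1 scaling
    cell_kpc = box_mpc * 1000.0 / cells_per_side   # Mpc -> kpc, then per cell
    return box_mpc, cell_kpc


# Values quoted in the text: 6.816 h^-1 Mpc box, h = 0.71, 256 cells per side.
box_mpc, cell_kpc = comoving_cell_size_kpc(6.816, 0.71, 256)
print(f"box = {box_mpc:.2f} Mpc, cell = {cell_kpc:.1f} kpc")
```

The result falls just below the 40 kpc requirement of Bryan et al. (1999) cited above.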

Enzo uses a volume averaged UV background that evolves with redshift and photo-ionizes
the primary cosmic chemical elements (H & He). Until recently, Enzo used the mean
intensity spectrum computed by Haardt & Madau (1996) (HM96), which was based on
quasar counts alone. Such a spectrum is insufficient to describe the ionization and thermal
state of the IGM at redshifts z > 5. Our current understanding of hydrogen reionization
places the onset of reionization at redshifts z > 6.5, where quasar counts alone cannot
provide the flux needed for the IGM to ionize. One possible source for the missing flux is
the population of dwarf galaxies, which form in shallow gravity wells between z = 12 and z = 7
and contribute to a soft, stellar UV background. The other possible source of ionizing
flux, PopIII objects at redshifts z > 15, is not addressed in this paper.

Haardt & Madau (2001) (HM01) have computed the redshift evolution of the volume
averaged UV intensity, which takes into account the evolving populations of both galaxies
and QSOs. We have incorporated this radiation background into Enzo in the form of the
frequency-integrated photo-ionization rates, $\Gamma_i = \int_{\nu_i}^{\infty} \frac{4\pi J(\nu,z)}{h\nu}\,\sigma_i(\nu)\,d\nu$, and photo-heating
rates, $G_i = \int_{\nu_i}^{\infty} \frac{4\pi J(\nu,z)}{h\nu}\,\sigma_i(\nu)\,(h\nu - h\nu_i)\,d\nu$. In our notation, $i$ indicates the chemical species
($i$ = HI, HeI, HeII), $\nu_i$ the species' ionization frequency, $\sigma_i(\nu)$ the ionization cross section and
$J(\nu,z)$ the mean intensity frequency distribution at redshift $z$. The photo-ionization and
photo-heating rates are used to update the local chemical abundances and local gas energy
respectively. In Figures (1) and (2) we show parametric fits to the redshift evolution of the
species' photo-ionization and photo-heating rates. At redshifts z > 4 we note the effect that
the galactic contribution has on the UV flux. The QSO contribution diminishes rapidly
at z > 3 due to the decrease of their comoving number density. However, the galactic
component, which peaks around z ~ 4, provides enough ionizing flux to at least ionize HI
and HeI (the softness of the stellar radiation makes it a poor ionizer of HeII, which requires
a harder spectrum).

A self-consistent calculation will compute the 3D propagation and percolation of the
ionization fronts (I-fronts) emanating from the UV production regions (Shapiro & Giroux
1987; Abel & Haehnelt 1999; Gnedin 2000; Razoumov et al. 2002; Ciardi et al. 2003;
Sokasian et al. 2003). In that scheme, reionization begins when the individual I-fronts
begin to merge and is completed when the volume filling factor of ionized matter is ~ 1.
Prior to the merging, regions of fully neutral hydrogen in the space between the I-fronts
have large column densities and therefore large optical depths. Quantitatively, a comoving
stencil ($\Delta x$) of neutral hydrogen at the cosmic mean gas density will have a column density
of $N_{\rm HI} \approx 2.6\times10^{16}\,\Omega_b h^2 (1+z)^2 \left(\frac{\Delta x}{1\,{\rm kpc}}\right)$ cm$^{-2}$. At z=7 and for $\Delta x$ = 40 kpc, the spatial
scale that resolves the Lyα forest (Bryan et al. 1999), we get $N_{\rm HI} \approx 1.5\times10^{18}$ cm$^{-2}$, which
corresponds to $\tau_{\rm LyC} \approx 9.3$. Despite this seemingly large continuum optical depth the
I-fronts will eventually burn through as the number density of the ionizing sources increases
and the volume is rarefied by cosmic expansion and structure formation. Since most of the
volume is at overdensities below the cosmic mean, the I-fronts propagate rapidly in
underdense regions while slowly ionizing denser clumps of neutral material. This mechanism
of reionization accelerates as the I-fronts merge because at that point two or more UV
sources can ionize the same region of space. This accelerated pace is illustrated in the steep
decrease within a very short time ($\Delta z_{\rm reion}$ = 0.2-0.3 or 25-40 Myrs) of the Gunn-Peterson
optical depth computed by Razoumov et al. (2002). 
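
The column density estimate above can be reproduced numerically. The following is a minimal sketch that assumes the scaling is read as N_HI ≈ 2.6×10^16 Ω_b h² (1+z)² (Δx / 1 kpc) cm^-2 and uses the standard HI photo-ionization cross section at the Lyman limit, 6.3×10^-18 cm²; the function name is hypothetical. With the rounded parameters of this section (Ω_b = 0.04, h = 0.71) it returns values of the same order as those quoted in the text, with differences of about ten percent that presumably reflect rounding of the inputs.

```python
SIGMA_LYC_CM2 = 6.3e-18  # HI photo-ionization cross section at the Lyman limit (cm^2)


def neutral_column_density(delta_x_kpc, z, omega_b, h):
    """N_HI (cm^-2) of a fully neutral comoving slab at the cosmic mean density.

    Assumed scaling: N_HI ~ 2.6e16 * Omega_b * h^2 * (1+z)^2 * (dx / 1 kpc).
    """
    return 2.6e16 * omega_b * h**2 * (1.0 + z) ** 2 * delta_x_kpc


# A 40 kpc comoving slab at z = 7 with the cosmology of this section.
n_hi = neutral_column_density(40.0, 7.0, omega_b=0.04, h=0.71)
tau_lyc = SIGMA_LYC_CM2 * n_hi  # continuum optical depth at the Lyman limit
print(f"N_HI ~ {n_hi:.2e} cm^-2, tau_LyC ~ {tau_lyc:.1f}")
```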

The study of reionization using a rising uniform UV background, which is applied 
at every point in the cosmic volume, lacks the mechanism of percolating I-fronts and 
therefore cannot self-consistently simulate the effect of reionization via the merging of 
ionized regions. In order to proceed we need to closely emulate the current understanding 
of reionization mechanics in our numerical setup. First, we need to choose the redshift at 
which reionization begins which corresponds to the phase of initial merging of the I-fronts. 
At that point, we initialize our background UV field to a tiny value. This approach is not 
far from reality because, despite the presence of bubbles filled with UV-radiation in the 
volume, most of the hydrogen mass is not yet ionized. The effect of our initialization is 
to begin to ionize the most underdense regions with a value for the photo-ionization rate 
that does not yet have an effect on higher density regions. The last claim is based on the 
fact that higher gas densities will have smaller recombination time-scales than lower gas 
densities and therefore achieve ionization equilibrium faster. The underdense regions on the 
other hand may never achieve ionization equilibrium and therefore their state of ionization 
is effectively determined by the ionization time-scale ($\Gamma^{-1} \approx 0.16$ Myr between z=7-6).
Ignoring the effects of recombination for small overdensities allows for a simplified estimate
of the time required for a fixed volume element $\delta V$ to be fully ionized. This is determined
by the balance between the number of ionizing photons per unit time and the available 
targets per unit volume to be ionized, $\delta\tau_{\rm ion} \propto \rho\,\delta V$. The last expression suggests that for
a uniform ionization rate the most underdense regions photo-ionize first. This 'bottom-up' 
ionization mechanism emulates the preferred I-front propagation channel which is the 
expansion into the underdense IGM. 

Our second 'free' parameter is the time-scale over which we
ramp up the ionization (and heating) rates to volume-averaged values that are physically
valid in the cosmic medium at high redshifts. In reality, the ramping of the volume-averaged
rates is expected due to the rapid increase of the UV-radiation volume-filling factor as the
I-fronts merge. In addition, as the cosmic volume becomes transparent to more sources of
radiation, photo-ionization of the IGM accelerates. Since the current highest-redshift Lyα
emitter lies at z = 6.56 (Hu et al. 2002), our choice for the onset of reionization needs to be
$z_{\rm reion} > 6.6$. We adopt the conclusions from the galactic ionization model by Razoumov et
al. (2002) and set $z_{\rm reion} = 7$. In addition, we ramp the ionization rates to the HM01 values
by z = 6.5 using an analytic ramp function with a skin-width of $\Delta z_{\rm reion} = 0.3$.

Choosing an earlier initialization redshift for the ionizing background does not have 
any measurable effect on the opacity of the IGM at z < 6.5. We demonstrate this in 
Fig. 3 where we compute the redshift dependence of the Gunn-Peterson optical depth 
(Peebles 1965), $\tau_{\rm GP} \approx 4.4\times10^{5}\,x_{\rm HI}\,(\Omega_b h^2)\,(1+z)^{3/2}$, for five cases using a lower grid
resolution simulation ($128^3$ grid cells) of the same box size and substituting for $x_{\rm HI}$
the mean HI neutral fraction in the volume. Our first case (A0) uses the QSO-only HM96
spectrum, where we extrapolate the fit to the rates beyond z=5.0 in order to initialize the
radiation field at $z_{\rm on} = 7$. The cases A through D correspond to the HM01 spectrum and
use the ionization rates from Figure (1). They differ only in the choice for the redshift
of the reionization onset. Our redshift range of interest lies at z < 6.5, where there are current
observations. We find no difference in the optical depth evolution between the
four models in that redshift range. Therefore, we choose $z_{\rm on} = 7$ for consistency with other
theoretical and numerical models.

Due to our limited box size, we terminate our calculation at z = 2. However, since
our focus is the high redshift Lyα forest, we expect that our computational box contains enough
mass power at these redshifts to make it a representative cosmic realization. A recently
completed simulation of a $54\,h^{-1}$ Mpc cosmology box with a grid resolution of $1024^3$ grid
cells and $1024^3$ dark matter particles will address the lack of matter power at large scales.
We plan to repeat this work for that simulation in order to determine if the deficiencies of
the current simulation have an effect on our conclusions.

3. Synthetic Spectra 

The method for generating synthetic spectra of the Lyα forest is described in Zhang et
al. (1997), Bryan et al. (1999) and Machacek et al. (2000). We begin the spectrum calculation
by selecting a point in the volume from which we cast random lines of sight. Along a
line of sight (LOS) we integrate the optical depth of the redshifted Lyα photons that are
scattered in the rest frame of an absorbing grid cell. The optical depth is given
by Equation (1) (Zhang et al. 1997), where a is the cosmic scale factor and the integration
limits are between the redshift of the initial point (equal to the redshift of the simulation
dump) and the redshift at which we wish to measure the Lyα absorption.

$$\tau_\nu(z) = \frac{c^2 \sigma_i}{\sqrt{\pi}\,\nu_i} \int \frac{n_i(z')}{b}\,\frac{a^2}{\dot{a}}\,\exp\left\{-\left[(1+z')\frac{\nu}{\nu_i} - 1 + \frac{v}{c}\right]^2 \frac{c^2}{b^2}\right\} dz' \qquad (1)$$

In our notation $\sigma_i$ and $\nu_i$ are the resonant cross section and rest-frame scattering
frequency for ion $i$ (e.g. HI or HeII), $b = \sqrt{2kT/m_i}$ is the local thermal speed and $n_i$ is the
local proper ion number density. The projected velocity along the LOS at the local cell, $v$,
is the sum of the projected peculiar velocity plus the Hubble expansion speed. The input
fields used to generate the synthetic spectra are the grid distributions of gas temperature,
the three components of the gas peculiar velocities and the ion proper density. Our redshift
integration range is $\Delta z = 0.1$, which corresponds to the frequency of the simulation dumps.
The optical depth integration assumes that the input fields do not have any comoving
evolution between $z_{\rm cube}$ and $z_{\rm cube} - \Delta z$.

Each line of sight is continuous through redshift space. At the end of integration we
store the position of the last absorbing cell and the direction of the ray and use them at
the beginning of the integration step through the next simulation dump. The size of our
simulation box is smaller than the distance traveled by the Lyα photons in $\Delta z = 0.1$ for
all redshifts z > 2. Therefore each LOS exits and re-enters the periodic computation box 
several times before the integration step is completed. The number of exits and re-entries 
depends on the redshift due to the volume's proper size increase with cosmic time. For 
more details on the subject we refer to Zhang et al. (1997). 

The transmitted flux at every redshift point is simply $F_\nu = \exp(-\tau_\nu)$. We call
the arithmetic mean of the transmitted flux along a LOS in a redshift bin $[z, z-\delta z]$ the
mean value $F(z_{\rm mean}) = \langle F_{los}\rangle_{\delta z}$ at $z_{\rm mean} = z - 0.5\,\delta z$. The LOS effective optical
depth at $z = z_{\rm mean}$ is then defined by $\tau_{\rm eff}(z) = -\ln(F(z))$. Following Gaztanaga &
Croft (1999), we define the LOS flux variance, the variance along a line of sight, through
$\sigma^2_{LOS} = {\rm Var}(F) = \langle F_{los}^2\rangle/\langle F_{los}\rangle^2 - 1$. The normalization by the mean flux is in analogy to
normalizing the density fluctuations to the mean density. The mean LOS-variance (MLV)
is then equal to $\sigma^2_{MLV} = \frac{1}{n_{los}}\sum_{los=1}^{n_{los}} \sigma^2_{los}$.






The mean values from all lines of sight constitute a population of random samples 
distributed about the expected value of the mean transmitted flux at that redshift. 
Therefore, we call the mean transmitted flux (MTF) in the redshift bin $[z, z-\delta z]$ the LOS
averaged mean $MTF(z) = \frac{1}{N_{LOS}}\sum_{los=1}^{N_{LOS}} \langle F_{los}\rangle_{\delta z}$. The unbiased variance of the MTF
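at redshift $z_{\rm mean}$ is computed by $\sigma^2_{MTF} = {\rm Var}(MTF) = \frac{1}{N_{LOS}(N_{LOS}-1)}\sum_{los=1}^{N_{LOS}}\left(\frac{\langle F_{los}\rangle}{MTF} - 1\right)^2$,
where we follow the same definition convention as with a single line of sight. Finally we
define the total MTF-variance (TMV) as the product $TMV = N_{LOS} \times \sigma^2_{MTF}$ (see Section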
4.2). 
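
The per-LOS and sample-averaged statistics defined in this section can be sketched in a few lines of Python. This is an illustration only: it assumes that the LOS variance is the flux variance normalized by the squared mean, <F²>/<F>² - 1, that the MTF and the MLV are plain averages over lines of sight, and that the input is a list of flux arrays covering one redshift bin. The function names and the toy spectra are hypothetical.

```python
import math


def los_statistics(flux):
    """Mean flux, effective optical depth and normalized variance of one LOS."""
    n = len(flux)
    mean_f = sum(flux) / n
    mean_f2 = sum(f * f for f in flux) / n
    tau_eff = -math.log(mean_f)
    sigma2_los = mean_f2 / mean_f**2 - 1.0  # <F^2>/<F>^2 - 1
    return mean_f, tau_eff, sigma2_los


def sample_statistics(spectra):
    """MTF, its effective optical depth, and the mean LOS-variance (MLV)."""
    stats = [los_statistics(flux) for flux in spectra]
    mtf = sum(s[0] for s in stats) / len(stats)
    mlv = sum(s[2] for s in stats) / len(stats)
    return mtf, -math.log(mtf), mlv


# Toy input: three short "spectra" of transmitted flux in one redshift bin.
spectra = [[0.9, 0.5, 0.1, 0.5], [0.2, 0.2, 0.2, 0.2], [0.0, 0.4, 0.8, 0.4]]
mtf, tau_eff, mlv = sample_statistics(spectra)
print(f"MTF = {mtf:.3f}, tau_eff = {tau_eff:.3f}, MLV = {mlv:.3f}")
```

A line of sight with uniform flux contributes zero to the MLV, while one that alternates between dark and transmitting pixels contributes most, which is the behaviour the normalized variance is meant to capture.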



4. Results 

A total of 75 random lines of sight were computed from z = 6.6 to z = 2. Spectra
synthesized with Voigt profiles at z > 6.6 did not show any transmission above our
flux cutoff ($= e^{-20} = 2.1\times10^{-9}$). Each integration redshift bin of $\Delta z = 0.1$ was resolved
by 30000 points, which results in a maximum redshift resolution of $R_z = 3\times10^5$. Our
spectral resolution then becomes a function of redshift, $R_\lambda = R_z \times (1 + z_{\rm mean})$, where $z_{\rm mean}$
is the mean of the redshift interval. Three resolutions are then considered in the present
work. The full resolution case (FRES) uses our raw synthetic spectra. A high resolution
(HRES) case convolves our synthetic spectra, in a redshift interval $\Delta z$ and at a mean redshift
$z_{\rm mean}$, with a Gaussian at the HIRES spectrograph resolution of R = 36,000. It is used
to compare with low redshift observed data. A low resolution (LRES) case convolves our
synthetic spectra to the ESI (Sheinis et al. 2000) resolution of R = 5,300. We use the term 'low
resolution' in comparison to the FRES case. Values of R > 5,000 are actually quite high
resolution for observed data at z > 4. Unless we state otherwise we apply the LRES case
only at z > 4.5.
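
To make the resolution bookkeeping concrete, the sketch below evaluates the native spectral resolution of the synthetic spectra and the width, in pixels, of the Gaussian kernel needed to degrade them to a target resolving power. It assumes that the instrument resolution is defined as R = λ/FWHM; the helper names are illustrative and do not come from the analysis pipeline.

```python
import math

POINTS_PER_BIN = 30000          # samples per integration bin
DELTA_Z = 0.1                   # width of the integration bin in redshift
FLUX_CUTOFF = math.exp(-20.0)   # transmission floor quoted in the text


def native_resolution(z_mean):
    """R_lambda = R_z * (1 + z_mean), with R_z the samples per unit redshift."""
    r_z = POINTS_PER_BIN / DELTA_Z
    return r_z * (1.0 + z_mean)


def gaussian_sigma_pixels(z_mean, r_target):
    """Sigma (in pixels) of a Gaussian whose FWHM corresponds to r_target."""
    fwhm_pixels = native_resolution(z_mean) / r_target
    return fwhm_pixels / (2.0 * math.sqrt(2.0 * math.log(2.0)))


for z_mean, r_target in ((3.05, 36000.0), (6.05, 5300.0)):
    print(z_mean, native_resolution(z_mean), gaussian_sigma_pixels(z_mean, r_target))
```

Because the native resolution exceeds both instrument values by more than an order of magnitude, the smoothing kernels span many pixels.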

In Figure (4) we show four samples of a synthetic Lyα forest absorption spectrum
along a single random line of sight at $z_{\rm mean}$ = [3.05, 4.05, 5.05, 6.05] in redshift intervals of
$\Delta z = 0.1$. At $z_{\rm mean} = 3.05$ and $z_{\rm mean} = 4.05$ we plot the transmitted flux in the HRES case.
The next two bins ($z_{\rm mean} = 5.05$ and $z_{\rm mean} = 6.05$) are plotted in the LRES case. Between
z ~ 4 and z ~ 5 an increasing number of low transmission regions (dark gaps) appear in
the Lyα forest that, as we shall show, are underionized high opacity overdensities. The size
of the dark gaps increases between z ~ 5 and z ~ 6 as the smaller gaps at lower redshifts
'merge' and the amplitude of the high transmission regions decreases.



4.1. Mean Transmitted Flux Evolution 

In Figure (5) we plot the mean transmitted flux (MTF) versus redshift in 30 redshift
bins with size $\delta z_{\rm bin} = 0.15$ (solid line), which corresponds to $\delta\lambda = 186$ Å. Overplotted
(diamonds) are the combined samples of HIRES and ESI data from Songaila (2004) and
the individual transmitted fluxes from each simulated line of sight. In Figure (6) the
transmitted flux is converted to optical depth. The red crosses show the evolution of
the effective optical depth, $\tau_{\rm eff} = -\ln(MTF(z))$. The smoothness of $\tau_{\rm eff}$ persists in
our calculation up to $z_{\rm mean} = 6.36$, which is consistent with the conclusion by Songaila
(2004). The solid red line through the red crosses is a power law least squares fit between
$z_{\rm mean} = 2.07$ and $z_{\rm mean} = 6.36$. The last red cross in Figure (6) lies well within the reionization tail
and was not included in the fit:

$$\tau_{\rm eff} = 2.11 \left(\frac{1+z}{6}\right)^{4.16\pm0.02} \qquad (2)$$

The cosmic parameters in our simulation were chosen in order to match low redshift 
observations and indeed the match with observed data at z < 3.5 is good. The blue 
lines in Figure (7) are a measure of the scatter of the LOS mean transmitted flux. They 
correspond to the minimum and maximum LOS mean values (converted to optical depths)
in each redshift interval. Within the LOS scatter range we are in decent agreement with
the observed values across the entire redshift range.

The solid lines in Figure (6) correspond to the full resolution of our synthetic spectra
(FRES). The spectral resolution does not have an effect on the mean flux value in a redshift 
interval but it does alter the scatter of the data about the mean. Low resolution spectra will 
have smaller scatter about the mean as is illustrated by the dashed blue lines in Figure (6) 
where we plot the LOS scattering in the LRES case for z > 4.5. The HRES case gives 
identical results with the FRES case and is ignored here. 

Songaila and Cowie (2002) derived the following parametric fit

$$F(z, g) = 4.5\, g^{-0.28} \left(\frac{1+z}{7}\right)^{2.2} \exp\left[-4.4\, g^{-0.4} \left(\frac{1+z}{7}\right)^{3}\right] \qquad (3)$$

for the mean Lyα transmission as a function of redshift, which closely matches the ΛCDM
calculations by Cen & McDonald (2002) in the redshift range of z=4-6. In the context of
a uniformly ionized IGM, the parameter g is the normalized ionization rate (McDonald &
Miralda-Escudé 2001):

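$$g \equiv \Gamma_{-12}\, T_4^{0.75} \left(\frac{\Omega_m}{0.35}\right)^{0.5} \left(\frac{\Omega_b h^2}{0.0325}\right)^{-2} \left(\frac{H_0}{65\ {\rm km\ s^{-1}\ Mpc^{-1}}}\right) \qquad (4)$$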

Following Songaila & Cowie (2002) and Songaila (2004) we assume a power law dependence
of the form $g = g^{\rm fit} = b_1 \left(\frac{1+z}{6}\right)^{b_2}$ and calculate the pair of coefficients $b_1$ and $b_2$ that
most closely matches Equation (3) to the computed optical depths in Figure (6). Our calculation
yields $b_1 = 0.60$ and $b_2 = -0.91$. The match between MTF(z) and the analytic fit has an
error of ~ 4% in the redshift interval z=4-6.4. This tight fit is shown on the left panel of
Figure (7) (dashed line) and is only valid in the range z = 4-6.4. The least squares fit
is overplotted as a solid line and has approximately the same margin of error (4%) in the
same redshift interval. However, the use of Equation (3) instead of Equation (2) allows for
the determination of the normalized ionization rate from the synthetic spectra.
The solid line on the right panel of Figure (7) is the normalized ionization rate as inferred
from the power law assumption in Equation (3). The effect that the 1σ deviation of the
scaling factor $b_1$ has on the ionization rate fit is shown as small dashed lines. The deviation
of the power law exponent $b_2$ was not considered in this graph because the redshift profile
of Equation (3) is very sensitive to that value. 
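
The two parametrizations of the mean transmission can be compared directly. The sketch below evaluates the power law of Equation (2) and the Songaila & Cowie (2002) form of Equation (3) with the fitted power law for g. Two details are assumptions made here because the typeset equations are partly illegible in this copy: the power law and the g fit are both normalized at (1+z)/6, and the redshift terms of Equation (3) at (1+z)/7. Under those assumptions the two effective optical depths agree to within a few percent over z = 4-6.4.

```python
import math


def tau_eff_power_law(z, amplitude=2.11, slope=4.16, pivot=6.0):
    """Equation (2) as read here: tau_eff = amplitude * ((1+z)/pivot)**slope."""
    return amplitude * ((1.0 + z) / pivot) ** slope


def g_fit(z, b1=0.60, b2=-0.91, pivot=6.0):
    """Power-law normalized ionization rate, g = b1 * ((1+z)/pivot)**b2."""
    return b1 * ((1.0 + z) / pivot) ** b2


def sc02_mean_flux(z, g):
    """Equation (3): mean transmission in the Songaila & Cowie (2002) form."""
    x = (1.0 + z) / 7.0
    return 4.5 * g**-0.28 * x**2.2 * math.exp(-4.4 * g**-0.4 * x**3)


for z in (4.0, 5.0, 6.0, 6.4):
    tau_pl = tau_eff_power_law(z)
    tau_sc = -math.log(sc02_mean_flux(z, g_fit(z)))
    print(f"z={z:.1f}  tau_powerlaw={tau_pl:.2f}  tau_SC02={tau_sc:.2f}")
```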

The normalized ionization rate can also be computed directly from the simulation data 
through Equation (4) if we substitute for the gas temperature (T4) the volume averaged 
temperature derived from our simulation data dumps. This rate is overplotted as the 
dashed-dot curve on the right panel of Figure (7) and does not have a power law profile. It 
is located however within the boundaries set by the 1σ deviation of the scaling factor $b_1$ and
has a mean absolute deviation from the power law fit ($g^{\rm fit}$) of ≈ 20% in the redshift interval
of interest (4-6.4). An improvement to this rate can be sought (Appendix I) if we intuitively
divide Equation (4) by the ratio of the clumping factors $C_{\rm HII}$ and $C_b$, where
$C_{\rm HII} = \langle\rho_{\rm HII}^2\rangle/\langle\rho_{\rm HII}\rangle^2$ is the HII clumping factor
and $C_b = \langle\delta^2\rangle$ the baryon clumping factor. The adjustment was motivated by the fact
that in a non-uniform IGM close to reionization the clumping factor of ionized hydrogen
(HII) is only approximately equal to the clumping factor of hydrogen (H). This marginal
effect is ignored in the derivation of Equation (3). The latter accounts for a clumpy baryon
distribution but not for the relative difference between $C_{\rm HII}$ and $C_b$. We compensate for
that by adjusting the normalized rate rather than the functional form of Equation (3). The
result is shown as a thick dashed line on the right panel of Figure (7). The mean absolute
deviation between the raw simulation data and the power law fit (spectra) is then improved
from ≈ 20% to ≈ 12%. Nonetheless, the shape of the input ionization rate still does not
conform to that of a power law. 

The fits we considered to analytically represent the mean flux evolution at z > 4 do
not include the last point at $z_{\rm mean} = 6.52 \pm 0.08$, which is located above the fit-curves in
Figure (6) and Figure (7). In addition, the error of the fits to the flux data is improved if
we also exclude the mean flux at $z_{\rm mean} = 6.36 \pm 0.08$. Those two points lie either inside
or too close to the reionization tail, which in our setup is located at redshifts z > 6.4. To
better relate observables to the underlying physical properties we show on the top-left panel
of Figure (8) the profile of reionization, as traced by the mean baryon fraction in neutral
hydrogen ($f_{\rm HI} = \langle\rho_{\rm HI}/\rho_b\rangle$) between z=5-7. The HI baryon fraction drops from ≈ 0.76 at
z = 6.8 to < $10^{-4}$ within $\Delta z = 0.4$. We ramp up the ionization/heating rates from a tiny
number ($10^{-20}$) at z=7 to the HM01 values at z=6.8 for numerical stability of the chemistry
solver and therefore we do not trust the computed species abundances in that redshift range. At
redshifts z < 6.3, $f_{\rm HI}$ evolves smoothly with redshift, which yields the smooth evolution of
the effective optical depth we measured in the synthetic spectra. 

To quantify the reionization profile we fit the HI baryon fraction evolution between
z=5-6.3 to a linear-log parametrization, $\log(f_{\rm HI}^{\rm smooth}) = 0.39(\pm0.01)\,(1+z) - 6.96(\pm0.03)$,
and then subtract the fit (extrapolated to z > 6.3) from $\log(f_{\rm HI})$. The result measures the
degree to which $f_{\rm HI}$ departs from the smooth evolution at z > 6.3. This 'reionization profile'
is normalized by its maximum value and fitted to a step function ($F_{\rm reion}$). The mean HI
baryon fraction is then recovered through $\log(f_{\rm HI}) = \log(f_{\rm HI}^{\rm smooth}) + 4.06 \times (1 - F_{\rm reion})$. Each
point along this profile can be interpreted as the "Reionization Completion Parameter"
(RCP). Reionization at 50% completion (RCP=0.5) sets the HI baryon fraction at about
100 times more than the one inferred by extrapolating the smooth evolution from lower
redshifts. The $F_{\rm reion}$ profile shows that spectra obtained between z=6.60-6.46 (the highest
redshift interval in the effective optical depth plot) sample the Lyα transmission when
reionization is between the 20-98 % completion level. 
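
The relation between the smooth fit, the completion parameter and the HI baryon fraction can be written out as a short sketch. It uses only the fit coefficients quoted above; the function names and the example redshift z = 6.5 are illustrative choices.

```python
def log_f_hi_smooth(z):
    """Linear-log fit to the post-reionization HI baryon fraction (z = 5-6.3)."""
    return 0.39 * (1.0 + z) - 6.96


def log_f_hi(z, f_reion):
    """log10 of the HI baryon fraction for a completion parameter in [0, 1]."""
    return log_f_hi_smooth(z) + 4.06 * (1.0 - f_reion)


# Excess of neutral hydrogen over the smooth extrapolation at 50% completion.
z = 6.5
excess = 10.0 ** (log_f_hi(z, 0.5) - log_f_hi_smooth(z))
print(f"f_HI / f_HI_smooth at RCP = 0.5: {excess:.0f}")
```

The excess depends only on the completion parameter, not on redshift, and is close to the factor of about 100 quoted in the text for RCP = 0.5.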

The MTF and variance evolution from our synthetic spectra are plotted in the high 
redshift interval (z=5-7) on the bottom panels of Figure (8). The mean transmitted flux
(bottom-left) evolves with redshift following the same power-law profile (solid straight line)
up to z ~ 6.25. It is then followed by an order of magnitude decrease within $\delta z = 0.5$.
Overplotted are the margins of error to the MTF at two confidence levels (CL) (bars: 
CL=68% open: CL=90%). The margins of error were determined by the distribution of the 
LOS mean fluxes at each redshift bin and they systematically increase with redshift for a 
fixed number of lines of sight. This is due to the increase with redshift of the LOS-variance 
which is used to compute the margin of error. The LOS-variance dependence on redshift 
is expected because it is effectively determined by the cosmic variance which increases 
as neutral hydrogen becomes more abundant. Small values of cosmic variance will yield 
mean fluxes along a set of lines of sight that are closely clustered about the LOS-averaged 
transmitted flux and therefore returning a small value for the LOS-variance. On the other 
hand, mean fluxes computed along a set of lines of sight would be widely spread (large 
LOS-variance) if the cosmic variance at that redshift has a large value. 
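
As an illustration of how such a margin of error depends on the spread of the LOS-mean fluxes, the sketch below uses a normal approximation for the two-sided interval on the sample mean. This is an assumption made for the example, not necessarily the procedure used for Figure (8), and the sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def margin_of_error(los_means, confidence):
    """Half-width of the two-sided interval on the mean (normal approximation)."""
    n_los = len(los_means)
    z_score = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z_score * stdev(los_means) / sqrt(n_los)


# Hypothetical LOS-mean fluxes in one redshift bin (not simulation output).
sample = [0.010, 0.012, 0.008, 0.030, 0.009, 0.011]
mtf = mean(sample)
for cl in (0.68, 0.90):
    print(cl, margin_of_error(sample, cl) / mtf)
```

A wider spread of LOS-mean fluxes, or a smaller number of lines of sight, increases the margin at either confidence level.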

The margin of error determines whether a particular mean transmitted flux value
statistically deviates from a smooth evolution or not. The MTF value in the redshift interval
z=6.28-6.44 cannot be used to conclude a deviation from a smooth evolution because the
upper error margin (at both confidence levels) includes the extrapolated curve from lower
redshifts. That particular redshift interval should be statistically excluded from being in the
reionization tail even though it samples the last 1% of it. However, this conclusion is based
upon the available number of lines of sight, since the margin of error scales as
$(n_{\rm LOS})^{-1/2}$. It is highly unlikely, though, that near-future observations at high
redshift will exceed 75 Lyα transmission lines. If we consider fewer lines of sight then the
increased margin of error will only solidify our conclusion about this particular redshift
interval. Therefore, transmitted flux values recorded close to the end of reionization give no
"statistical" indication of sampling the tail, despite the high probability that an individual
transmitted flux value may lie well beneath the smooth evolution curve.

The transmitted flux values measured in the redshift interval z=6.44-6.60 are located
well within the reionization tail. They in fact sample most of reionization's completion
level curve. The mean transmitted flux is 0.95 dex below the extrapolated power law
of Equation (2). The upper error bar at CL=90% is also below the power law curve by
0.55 dex, which statistically places that redshift interval within reionization. However,
there is a non-zero probability that a single line of sight would yield a value for the
transmitted flux above $F_{\rm fit} = \exp(-\tau_{\rm eff}^{\rm fit}(z))$, where $\tau_{\rm eff}^{\rm fit}$ is the power law of
Equation (2), with amplitude 2.1 ± 0.16 and slope 4.16 ± 0.02. That probability depends on
the extent of the redshift interval in which the LOS-spectra are sampled. The larger the
extent, the more low transmission pixels are included from higher redshifts, and therefore
the probability of a high transmission value becomes smaller. For Δz ~ 0.16 the probability
that a line of sight would yield a mean flux value larger than $F_{\rm fit}$ is
$P = P(F_{\rm LOS} > F_{\rm fit}) \sim 1\%$. If instead we resolve the last
redshift interval with four bins of Δz = 0.04 then P=[0.2,0.08,0.01,0.001] in each of the
intervals [6.44-6.48], [6.48-6.52], [6.52-6.56] & [6.56-6.60] respectively. The mean redshift in
each bin corresponds to RCP values of RCP=[0.97,0.90,0.70,0.28], which suggests that the
probability of a single line of sight missing the last 10% of the reionization epoch is more
than 8%. Although this number is relatively small and model dependent, it illustrates the
importance of variance when synthesizing or observing spectra in proximity to or within
the reionization tail.

The bottom-right panel shows the evolution of the TMV-variance (solid line) and
MLV-variance (dashed: FRES; dashed-dot: LRES). We note that the LRES variance is
smaller than in the FRES case, as expected. The TMV-variance is the same in all
cases since the mean flux is not affected by the spectral resolution. The data were fitted to
linear-log profiles for z < 5.8, which are overplotted in the figure. All types of variance show
a break from a linear-log profile for z > 5.8, which is earlier (time moves backward in this
picture) than the corresponding break of the MTF from its fit. However, all cases remain
within 1σ of their lower-redshift fit curves up to z = 6.28. In the following section, we
examine more closely the difference between the two types of variance we defined and their
significance in mapping reionization.

4.2. Flux Variance Evolution

In section (2) we defined the MTF-variance as the measure of dispersion of the
LOS-mean flux values in each redshift bin about the mean transmitted flux (MTF). In
addition, we defined the MLV-variance as the LOS-average of the variances along individual
lines of sight. We plot in Figure (9) the total MTF-variance ${\rm TMV} = N_{\rm LOS} \times \sigma_{\rm MTF}^2$ and
MLV-variance against redshift from z=2.5-6.6. The black triangles show the TMV-data
while the dashed and dot-dashed lines connect the MLV-data for the FRES and LRES
cases respectively. We find only a very small difference between the FRES and HRES cases at
z < 4.5, therefore only the latter is shown in that range. Overplotted as color points
are the variance values from the individual lines of sight (green: FRES; blue: HRES; red:
LRES), which are scattered about their respective MLV lines. Across the redshift range
z=2.5-5.8, a linear-log least squares fit of the form $\log({\rm Var})^{\rm fit} = A_0 + A_1(1+z)$ matches our
variance data from Figure (9) with better than 5% mean absolute deviation. The constants
$A_0$ and $A_1$ depend on the resolution and the type of variance and are given in Table-I. The
redshift point z=5.8 was selected visually because of an apparent break of the TMV-data
from a straight line there (solid line in Figure (8), bottom-right panel). However, only data
at z > 6.28 have variance values whose deviation from the fit,
$|\log({\rm Var}) - \log({\rm Var})^{\rm fit}|$, exceeds the 1σ uncertainty of the fit. This redshift
value matches the point of departure from a smooth profile of the MTF evolution shown in
Figure (8) (bottom-left panel) and corresponds to the redshift interval (6.28 < z < 6.44)
which samples the very last stages of reionization. However, because the mean flux and the
variance break from their smooth profiles simultaneously and significantly, the margin of
error introduced to the MTF is large enough to allow a possible continuation of a smooth
evolution through that redshift interval.

The steep increase of the transmitted flux variance as we cross into the reionization
phase is expected because the variance at each redshift is equal to the total transmitted
flux power (Tytler et al. 1997). In turn, the total transmitted flux power is proportional
to the total mass power of neutral hydrogen, which decreases rapidly as reionization takes
place. Our computation volume is limited in that respect because the largest length scale
it simulates (smallest wave-number) corresponds to the box size of $6.816\,h^{-1}$ Mpc. Larger
length scales can be sampled by wrapping a synthetic line of sight through the simulated
volume. This is equivalent to assuming that the cosmic volume is an ensemble of such
volumes. However, in doing so there is no additional gain in flux power. In this section we
attempt a description of the cosmic evolution of the flux variance which, within the
limitations of the present simulation, is a lower bound estimate.

A redshift profile with the functional form ${\rm Var} = -1 + C_0 (1+z)^{C_1} F^{C_2}$ ($C_2 < 0$) can
be inferred from theory (Appendix II) for the transmitted flux variance. This equation
was derived by computing the second moment of the local transmission over the volume
distribution of densities following Songaila and Cowie (2002), and it should generally be
valid in the post-reionization era (z=4-6.25) where Equation (3), F(g,z), also fits our optical
depth data. Over the range of $\gamma$ considered, starting from $\gamma = 1$, the constant $C_1 \ll 1$,
so that $(1+z)^{C_1} \approx 1$ (Appendix II).
Using the discrete LOS variance values in the FRES case from Figure (9) we can estimate
that between $z_1 = 4$ and $z_2 = 6.28$ the ratio $[1+{\rm Var}(z_2)]/[1+{\rm Var}(z_1)]$ lies between 16.6 and 25.7.
This range compares well with the one inferred from $\approx [F(z_2)/F(z_1)]^{C_2} = 16.6 - 22$ for $\gamma = 1$ and
the upper end of the $\gamma$ range respectively. In the last approximation, the values of $F(z_{1,2})$ were obtained from
the Equation (3) fits to our data (Figure (7), left panel). The validity of the analytic
expression does not, however, suggest a deterministic dependence of the variance on the
mean transmitted flux but rather reflects that both quantities are correlated through their
dependence on the mean IGM opacity.
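
The ratio comparison in this paragraph follows from the quoted functional form in one step. The block below is a short sketch of that step, assuming only the profile given in the text and a negligible $C_1$; the subscripts 1 and 2 simply label the two redshifts.

```latex
\begin{align}
1 + \mathrm{Var}(z) &= C_0\,(1+z)^{C_1}\,F(z)^{C_2}, \\
\frac{1 + \mathrm{Var}(z_2)}{1 + \mathrm{Var}(z_1)}
  &= \left(\frac{1+z_2}{1+z_1}\right)^{C_1}
     \left[\frac{F(z_2)}{F(z_1)}\right]^{C_2}
  \approx \left[\frac{F(z_2)}{F(z_1)}\right]^{C_2}
  \qquad (C_1 \ll 1).
\end{align}
```

Because $C_2 < 0$ and the mean flux decreases with redshift, the ratio is larger than one, in line with the growth of the variance toward reionization.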

Both the MTF-variance and the mean LOS-variance are measures of the "cosmic
variance" from two different perspectives. The MTF-variance measures the normalized
dispersion of a random sample of LOS-mean fluxes. MLV in our setup measures the mean
value of a random sample of normalized LOS-variances. Statistically, the mean value $\bar{x}$ of
a sample of random values $x_i$ drawn from a larger population is a point estimate of the
population mean $\mu$. On the other hand, the variance of N random means $\bar{x}_i$ underestimates
the variance about the population mean by a factor N. Thus, we multiply the computed
MTF-variance by NLOS (number of lines of sight) to recover the total MTF variance
(TMV), which is an 'estimate' of the cosmic flux variance. Figure (9) shows that for redshifts
z < 6.28 both types of variance have values within the scatter of the individual LOS data
(points). The comparison also shows that only the FRES/HRES MLV-data can be related
to the total MTF-variance results, which are not sensitive to resolution.

Despite both variance types scaling in a similar manner with redshift, they are
not equal because of the different normalization, and therefore a unique measurement
of the cosmic variance of the transmitted flux cannot be derived from either of them.
To directly compare one type of variance to the other, we are required to revert to the
standard (un-normalized) variance definition. The standard total MTF variance is equal
to $\sigma_M^2 = \sigma_{\rm MTF}^2 \cdot [{\rm MTF}]^2 \cdot N_{\rm LOS}$. The standard mean LOS-variance is similarly equal to
$\sigma_L^2 = \frac{1}{N_{\rm LOS}} \sum_{j=1}^{N_{\rm LOS}} \sigma_j^2 F_j^2$, where $F_j$ is the mean flux along a line of sight j [1]. In theory, if
the individual LOS-mean fluxes ($F_j$) are independent random measurements then $\sigma_M^2 = \sigma_L^2$
(Appendix III).

[1] $\sigma_L$ is used in the literature to describe the linear mass variance. Here the notation stands
for the weighted mean of the variances along lines of sight.
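
The two estimators can be made concrete with a small array calculation. The sketch below is one reading of the definitions in this section, not the paper's own code: it assumes that $\sigma_{\rm MTF}^2$ is the variance of the LOS-mean fluxes divided by MTF$^2$ and that each $\sigma_j^2$ is the pixel variance along line of sight j divided by $F_j^2$. The function name and the dictionary keys are illustrative.

```python
import numpy as np


def flux_variances(flux):
    """flux: array of shape (n_los, n_pix) with the transmitted flux per pixel."""
    flux = np.asarray(flux, dtype=float)
    n_los = flux.shape[0]
    los_mean = flux.mean(axis=1)                 # F_j
    mtf = los_mean.mean()                        # mean transmitted flux
    sigma2_mtf = los_mean.var() / mtf**2         # normalized MTF-variance (assumed form)
    sigma2_j = flux.var(axis=1) / los_mean**2    # normalized LOS-variances (assumed form)
    return {
        "TMV": n_los * sigma2_mtf,                      # total MTF-variance
        "MLV": sigma2_j.mean(),                         # mean LOS-variance
        "sigma2_M": sigma2_mtf * mtf**2 * n_los,        # standard total MTF variance
        "sigma2_L": np.mean(sigma2_j * los_mean**2),    # standard mean LOS-variance
    }


# Two hypothetical lines of sight with two pixels each.
print(flux_variances([[1.0, 3.0], [2.0, 6.0]]))
```

With real spectra the two standard variances can then be compared redshift bin by redshift bin, which is the comparison carried out in Figure (10).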

On the top-left panel of Figure (10) we plot $\log(\sigma_M/{\rm MTF})$ and $\log(\sigma_L/{\rm MTF})$ in the
redshift range z=2.5-6.6. The first quantity is TMV from Figure (9) (triangles). The second
quantity is the renormalized MLV-variance in the FRES case (dashed line). Parametric
fits to the data following a linear-log profile are provided in Table-I, which shows that the
scatter of the fits increases if we include values up to z < 6.25 compared to the fits obtained
from z < 5.8. Nonetheless, the curves only break away from a linear-log profile above
the 1σ scatter of the fits upon crossing into the reionization phase in the redshift interval
[6.28-6.44]. The value of the variance increases by 2 dex between $z_{\rm mean} = 6.2 - 6.5$. The
decrease in the MTF (ignoring errors) is ~ 1.4 dex in the same $z_{\rm mean}$ range. This difference
in the degree of change of the two quantities (mean flux and flux variance) within the
reionization tail has a significant implication. The baryon mass in our computation box has
a mean value equal to the cosmic mean and therefore we do not expect the calculation of
the MTF to be appreciably biased by the finite volume. However, as discussed previously,
our calculation of the flux variance in our finite volume yields at best a lower-bound
estimate. This implies that a bigger computation volume or line of sight observation, which
would better sample the mass power spectrum at large scales, would in turn yield a larger
flux variance and a bigger degree of change within the reionization tail. Therefore, we can
conclude that the difference in the degree of change within the reionization tail between the
mean flux and flux variance would generally be larger than the 0.6 dex we measured in this
work. This suggests that the flux variance exhibits a greater sensitivity to the evolution of
the HI distribution than the mean flux does, and it could be used instead of the latter in
mapping reionization.

The above conclusion is based upon the assumption that both types of variance we
defined are estimates of the cosmic variance of the transmitted flux in our synthetic Lyα
spectra. However, the validity of such an assumption depends on whether the two estimates
agree with one another. The logarithmic y-axis on the top-left panel of Figure (10)
might 'mask' any significant differences between the two quantities. We address this
by plotting the ratio between $\sigma_M^2$ and $\sigma_L^2$, each normalized to the square of the mean
transmitted flux, in the redshift range z=2.5-6.6 (top-right panel of Figure (10): squares).
For comparison, we overplot (triangles) the ratio where we substitute in the denominator
the mean LOS-variance ($\sigma_{\rm MLV}^2$). The agreement between $\sigma_M^2$ and $\sigma_L^2$, both normalized to
the MTF, generally holds well throughout the redshift interval. On the other hand, $\sigma_{\rm MLV}^2$ agrees
with the normalized $\sigma_L^2$ only up to z ~ 5.5. At redshifts z > 5.5 the mean LOS-variance consistently
yields smaller values than the average LOS-variance weighted by the ratio $(F_{\rm LOS}/{\rm MTF})^2$.
The difference can be explained by the fact that $\sigma_{\rm MLV}^2$ is a straightforward arithmetic
average of the individual normalized variances along each line of sight. At redshifts close
to reionization high transmission regions become rare. Most lines of sight in fact do not
include high transmission regions in the last redshift interval (z=6.44-6.62). Therefore,
at redshifts close to reionization, the arithmetic mean of variances will be biased by the
large number of small flux variance spectra along lines of sight which do not include high
transmission regions. On the other hand, the weighted average LOS-variance is dominated
by the few high transmission regions, as we will explain in Section (5), and therefore it
yields a larger value. At redshifts z < 5.5 the mean LOS-variance and weighted average
LOS-variance have approximately the same ratio to the mean transmitted flux variance.
This is not unexpected despite the different normalization. Small values for the individual
LOS-variances (at low to medium redshifts) suggest that the individual LOS-mean fluxes are
less scattered about the corresponding MTF value. Therefore, if $F_j \approx {\rm MTF}$ (j=1,NLOS)
then $(\sigma_L/{\rm MTF})^2 \approx \sigma_{\rm MLV}^2$.

Finally, it is important that one samples the cosmic Lyα transmission with a
statistically adequate number of lines of sight. Any random sample introduces a margin of
error which becomes important to minimize at high redshifts where the cosmic variance is
large. A small number of lines of sight will retain the skewness of the original population of
transmission values, but as long as the distribution is highly peaked the global average
of mean fluxes can be trusted. For our 75 lines of sight, equal to the number of mean fluxes
per redshift interval, we compute the redshift evolution of the skewness and kurtosis of the
mean flux sample. Following Croft et al. (1998) they are defined as
$\gamma(z) = \langle (x-\bar{x})^3 \rangle / \sigma^3$ and $k(z) = \langle (x-\bar{x})^4 \rangle / \sigma^4$
respectively. In our notation, $x = F_j/{\rm MTF}$ (j = 1,NLOS), $\sigma = \sigma_{\rm MTF}$ and
$\langle\,\rangle$ denotes an arithmetic averaging over all lines of sight. The results are shown in the
lower two panels of Figure (10). We find that the distribution of line of sight mean fluxes
per redshift interval has on average a skewness that fluctuates about zero with redshift
but is consistently positive at redshifts close to reionization (z > 5.5). The kurtosis,
on the other hand, consistently decreases with redshift, which shows that the distribution
of LOS-mean fluxes becomes flatter. Therefore, if one requires an estimate of the global
average of the transmitted flux with a fixed margin of error, a number of lines of sight that
increases progressively with redshift is needed in order to compensate for the flattening of the
distribution of mean fluxes, which arises from the increasing cosmic variance.
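
The two shape statistics can be computed directly from a sample of LOS-mean fluxes. The sketch below assumes the standardized third and fourth moments, with the fluxes normalized by the MTF and without the "minus 3" excess-kurtosis convention. Both choices are assumptions of this example and should be checked against Croft et al. (1998).

```python
import numpy as np


def skewness_kurtosis(los_means):
    """Standardized third and fourth moments of the normalized LOS-mean fluxes."""
    x = np.asarray(los_means, dtype=float)
    x = x / x.mean()                      # x_j = F_j / MTF (assumed normalization)
    dev = x - x.mean()
    sigma = np.sqrt(np.mean(dev**2))      # dispersion about the mean
    skew = np.mean(dev**3) / sigma**3
    kurt = np.mean(dev**4) / sigma**4     # equals 3 for a normal distribution
    return skew, kurt


# A hypothetical sample with one rare high-transmission line of sight.
print(skewness_kurtosis([0.01, 0.01, 0.01, 0.10]))
```

A single high-transmission line of sight in an otherwise low-flux sample is enough to produce a positive skewness, which is the behaviour described above for z > 5.5.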

On the left panel of Figure (11) we plot the margin of error relative to the MTF as it
scales with redshift. At the 90% confidence level (CL) the margin of error is ~ 60% of the
MTF value at z = 5.5 (~ 40% at CL=68%) and larger than 100% at z > 6.4 for both
confidence levels. This shows that 75 lines of sight undersample the distribution of mean
fluxes at high redshifts. In distributions with significant skewness the margin of error is
different for the two ends. The margin of error reported in Figure (11) for those cases is
the average margin of error between the two tail ends. If we assume that the LOS-mean
flux measurements are normally distributed at all redshifts then we can estimate the
LOS number required to determine the margin of error within 10% of the MTF. For our
simulated cosmic volume we would need 1200 lines of sight for the MTF to be computed
within 10% (CL=90%) at z ~ 5. At CL=68% the required number of lines of sight drops
to about 600 at z ~ 5. The acquisition of a few hundred high redshift Lyα observations is
currently unattainable and therefore the large margins of error due to small sample sizes
are unavoidable.
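
Since the margin of error scales as the inverse square root of the number of lines of sight, the required sample size for a target margin follows from a one-line rescaling. The sketch below assumes normally distributed LOS-mean fluxes, as in the estimate above; the function name is illustrative and margins are given in percent of the MTF.

```python
from math import ceil


def required_los(n_current, margin_current, margin_target):
    """Lines of sight needed to shrink the margin of error to margin_target.

    Uses margin ~ n**-0.5, so n_required = n_current * (margin_current / margin_target)**2.
    Both margins must be expressed in the same units (here percent of the MTF).
    """
    return ceil(n_current * (margin_current / margin_target) ** 2)


# 75 lines of sight with the ~60% margin quoted at z = 5.5 (CL=90%), target 10%.
print(required_los(75, 60, 10))
```

The quadratic dependence is why halving the margin of error requires four times as many lines of sight.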



5. Properties of the Flux Distribution 

The distribution of LOS-means at high redshifts is positively skewed because they
are derived from the positively skewed transmitted flux distributions along the individual
lines of sight. Therefore extreme LOS-mean fluxes at high redshifts are associated with
rare high transmission regions. That in turn results in a few "large" LOS-mean fluxes
dominating the mean transmitted flux (MTF). The few high transmission regions also
dominate the variance. If $\sigma_M^2 = \sigma_L^2$ then the MTF-variance is the weighted sum of the
individual LOS-variances ($\sigma_j^2$), $\sigma_{\rm MTF}^2 = \sum_{j=1}^{N_{\rm LOS}} \sigma_j^2 W_j^2$ (so that the total
MTF-variance is $N_{\rm LOS} \times \sum_{j=1}^{N_{\rm LOS}} \sigma_j^2 W_j^2$), where $W_j = F_j / \sum_{j=1}^{N_{\rm LOS}} F_j$.
If there exists a number of high transmission lines with LOS-mean fluxes $F_k \gg F_{j \neq k}$ then
$W_k \gg W_{j \neq k}$ and therefore the variance calculation would be biased by such lines of sight.

On the top panels of Figure (12) we plot the transmitted flux (left panel) and
flux variance (right panel) against redshift for z > 4.5 while eliminating the statistical
contribution of the rare high transmission regions at large redshifts. We do that by
substituting ${\rm MODE}(F_j(z)) = {\rm MDF}(z)$ for MTF(z). ${\rm MODE}(X_j)$ selects the most
frequent LOS-mean transmitted flux value. $F_j(z)$ denotes the mean flux of line of sight j
at redshift z. The mean fluxes from our 75 lines of sight were binned into 24 optical depth
bins from the minimum LOS-mean flux value (converted to optical depth) to the maximum
one at each redshift. The number of lines of sight within each bin was then counted and
the mode was computed from the peak of the distribution. The variance was calculated by
modifying the $\sigma_{\rm MTF}^2$ formula as follows. If $N_{\rm mode}$ is the number of lines of sight with mean
fluxes within the bin of the mode and k is the array of indices that represents those LOS,
then $\sigma_{\rm mode}^2 = N_{\rm mode} \times \sum_{j \in k} \sigma_j^2 W_j^2$. Our intent is to measure how the low transmission lines
of sight at high redshifts, which are more numerous and therefore have a higher probability
of being measured, affect mapping reionization.
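
The mode selection described above can be sketched as a histogram in optical depth. This is an illustrative implementation under stated assumptions: the optical depth of a line of sight is taken as minus the natural logarithm of its mean flux, the 24 bins are equally spaced between the extreme values, and the mode is reported at the center of the most populated bin. The function name and return values are hypothetical.

```python
import numpy as np


def mode_flux(los_means, n_bins=24):
    """Most frequent LOS-mean flux, found by binning in optical depth."""
    tau = -np.log(np.asarray(los_means, dtype=float))
    edges = np.linspace(tau.min(), tau.max(), n_bins + 1)
    # Bin index of every line of sight; the maximum value goes in the last bin.
    bin_index = np.clip(np.digitize(tau, edges) - 1, 0, n_bins - 1)
    counts = np.bincount(bin_index, minlength=n_bins)
    peak = int(np.argmax(counts))
    tau_mode = 0.5 * (edges[peak] + edges[peak + 1])
    members = np.flatnonzero(bin_index == peak)   # the index array k of the text
    return np.exp(-tau_mode), members


# Hypothetical sample: five similar low-transmission lines plus two extremes.
fluxes = np.exp(-np.array([0.0, 2.4, 1.25, 1.25, 1.25, 1.25, 1.25]))
mdf, k = mode_flux(fluxes)
print(mdf, k)
```

The returned index array identifies the lines of sight that enter the restricted variance sum.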

Our calculations show that the high-z low transmission lines of sight (solid curve) will
indeed underestimate the global average of the MTF value (dashed curve) by ~ 1.6 dex (solid
curve in lower left panel) during the last redshift interval (z=6.44-6.60), which samples the
reionization tail. However, the use of such lines of sight will not alter the redshift at which
the flux redshift evolution breaks from a smooth profile. In fact, the break is steeper when
compared to the MTF evolution. Therefore, a sample of low transmission lines of sight
might offer insight into when reionization is almost complete, but it will not offer a
reliable measurement of the global average of the transmitted flux during reionization. On
the other hand, rare high transmission lines of sight at high redshifts will miss the break of
the MTF evolution at the end of reionization and infer the continuation of a smooth profile.
Such is the case of the cross points on the left panels of Figure (12), where we plot the
redshift evolution of the mean transmitted flux based on lines of sight that have LOS-mean
values above a fixed power-of-ten threshold in the last redshift interval. Our conclusion is reinforced by the calculation of
the variance for the low transmission lines of sight, which is indeed smaller for z > 6.0 when
compared to the global MTF variance computed in the previous section (right panels). The
computed difference of ~ 0.4 dex (in standard deviation units) in the last redshift interval
may be considered small; however, it increases the confidence in the mean transmitted flux
value inferred by a sample of high-z low transmission lines of sight.

Our differentiation between low and high transmission lines of sight was based upon
examining LOS-mean values computed from the arithmetic mean of pixel flux values across
the redshift interval of our choice (~ 0.16). Therefore, the same type of bias exists toward
high transmission regions when computing the flux average along a single line of sight. This
effect depends on the size of the redshift interval. With fixed redshift resolution, the smaller
the size of the redshift bin at high redshifts, the more susceptible our mean value is to the
presence of a high transmission region. If the majority of the transmitted flux has values
much less than the arithmetic mean then the size of the redshift bin should be increased to
give 'statistical power' to the low transmission segments within a line of sight.

On the left panel of Figure (13) we plot the discrete flux distribution (DFD) in constant
logarithmic bins ($\Delta \log F = 0.16$) in the redshift interval z=5.95-6.45, in the LRES case only.
The curve was computed by averaging the number of pixels from all lines of sight in each
bin, hence the presence of error bars at large flux values. The curve was then normalized
to the total number of pixels. For reference, we overplotted the mean flux (solid line) and
the upper-bound extreme of the Becker gap (shaded area). The graph shows that low
transmission regions have the largest probability of detection at high redshifts. We can
then compute from the graph that flux values less than the upper bound of the Becker gap
have a probability of ~ 80% of being recorded in our synthetic spectra. However, the
mean flux computed in the redshift interval and represented by the dashed line lies in high
transmission flux bins. In the right panel of Figure (13), the mean transmitted flux
in each of the flux bins of the DFD curve is plotted against the mean flux in each bin.

In addition to the discrete flux distribution curve we also compute the Cumulative
Distribution of transmissions (Rauch et al. 1997), following Songaila and Cowie (2002).
On the left panel of Figure (14) we plot the Cumulative LRES-Flux Distribution (CFD)
for the redshift bins (from bottom up) (4.25-4.75), (4.95-5.45), (5.45-5.95). In addition,
we include the DFD redshift bin (5.95-6.45) which samples the transmitted flux close to
the reionization tail. The data points for the first three bins are sample values extracted
from the observed CDF-curves in Figure (8) of Songaila and Cowie (2002). For each
transmitted flux threshold (x-axis) we average the fraction of the flux that lies below
the threshold value (y-axis) from all lines of sight and compute the standard deviation.
The shaded areas are included between the ±2σ curves. There is a general agreement
with the observed data in the low and medium transmission region. We believe that the
disagreement in the high transmission end of the x-axis is due to the sensitivity of the CFD
to the extrapolated continuum in the observed data.

The right panel of Figure (14) is a logarithmic zoom along the x-axis into the CDF profile of
the last redshift bin. Similarly to the right panel of Figure (13) we overplot the Becker-gap
mean flux and upper-bound extreme. The curve shows that ~ 80% of all the flux pixels in
the redshift interval z=5.95-6.45 are below the Becker upper-bound extreme.
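
The cumulative distribution and its line-of-sight scatter can be sketched as follows. The example assumes that "the fraction of the flux below the threshold" means the fraction of pixels along each line of sight with flux below that threshold. This reading, the function name, and the input arrays are assumptions of this sketch.

```python
import numpy as np


def cumulative_flux_distribution(flux, thresholds):
    """Mean and standard deviation over lines of sight of the pixel fraction below each threshold.

    flux: array of shape (n_los, n_pix); thresholds: 1-D array of flux thresholds.
    """
    flux = np.asarray(flux, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    below = flux[:, :, None] < thresholds[None, None, :]   # (n_los, n_pix, n_thr)
    fraction = below.mean(axis=1)                          # per line of sight
    return fraction.mean(axis=0), fraction.std(axis=0)


# Two hypothetical lines of sight with four pixels each.
mean_cfd, std_cfd = cumulative_flux_distribution(
    [[0.1, 0.2, 0.3, 0.4], [0.1, 0.1, 0.5, 0.9]], [0.15, 1.0]
)
print(mean_cfd, std_cfd)
```

The shaded ±2σ band of Figure (14) would correspond to the mean curve plus and minus twice the returned standard deviation.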

6. Dark Gap Distribution 

In the previous sections, we have determined that high transmission regions in the
Lyα forest at large redshifts become rare close to the epoch of reionization. Therefore an
alternative to analyzing the statistical properties of the transmitted flux is
the distribution of dark gaps (Croft 1998). By convention a transmission gap is defined as a
contiguous region of the spectrum with τ > 2.5 over rest-frame wavelength intervals greater
than 1 Å. The gap distribution is then the frequency distribution function of the gap
wavelength width (GWW). Songaila & Cowie (2002) obtained the gap distribution in the
Lyα region of the high redshift quasars BR 1202-0725 and SDSS 1044-0125 in four redshift
intervals (3.5-4.0), (4.0-4.5), (4.5-5.0) and (5.0-5.5), which showed a slow variation below
z=5.0 and a rapid increase in the number of gaps at z > 5. In this section, we repeat their
analysis using our synthetic spectra samples, adding two higher redshift intervals, (5.5-6.0)
and (6.0-6.5).
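
The gap definition translates into a run-length search along each spectrum. The following sketch assumes that the width of a gap is the rest-frame wavelength difference between its first and last dark pixel, which neglects the finite pixel width. The function name and the default arguments mirror the thresholds quoted above but are otherwise illustrative.

```python
import numpy as np


def dark_gaps(rest_wavelength, tau, tau_min=2.5, min_width=1.0):
    """Widths (in Angstrom) of contiguous runs with tau > tau_min wider than min_width."""
    wavelength = np.asarray(rest_wavelength, dtype=float)
    dark = np.asarray(tau) > tau_min
    # Pad with False so that runs touching either end of the spectrum are closed.
    padded = np.concatenate(([False], dark, [False]))
    change = np.diff(padded.astype(int))
    starts = np.flatnonzero(change == 1)    # first dark pixel of each run
    ends = np.flatnonzero(change == -1)     # one past the last dark pixel
    widths = wavelength[ends - 1] - wavelength[starts]
    return widths[widths > min_width]


# Hypothetical spectrum sampled every 0.5 Angstrom with one wide and one narrow gap.
wl = np.arange(1000.0, 1010.0, 0.5)
tau = np.zeros(wl.size)
tau[2:8] = 5.0     # spans 1001.0 to 1003.5 Angstrom
tau[10:12] = 5.0   # spans 1005.0 to 1005.5 Angstrom, below the 1 Angstrom cutoff
print(dark_gaps(wl, tau))
```

Binning `np.log10` of the returned widths in steps of 0.25 would then give a gap width histogram of the kind discussed in this section.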

In Figures (15,16) we plot the gap width distribution (GWD) against the GWW in
constant logarithmic bins of size 0.25. The number of gaps per GWW-bin is averaged from
all lines of sight and then normalized to the redshift path $\Delta X$, which is set by the upper
and lower bounds of the redshift interval. Overplotted in
Figure (15) are the observed data (diamonds) from Songaila & Cowie (2002). The vertical
lines are the observational error bars. If no diamond point is associated with a vertical
line then what is plotted is an upper bound estimate. In order to compare with the observed
redshift evolution of the GWD, we consider contiguous gaps where the Lyβ optical depth
also satisfies the dark-gap threshold ($\tau_{\rm Ly\beta} > 2.5$). The Lyβ optical depth is derived from the
sum of the direct Lyβ absorption and the Lyα absorption at the redshift
$1 + z_\beta = \frac{\lambda_\beta}{\lambda_\alpha}(1+z)$ (Songaila & Cowie 2002),

$\tau_{\rm Ly\beta} = \frac{f_\beta \lambda_\beta}{f_\alpha \lambda_\alpha}\,\tau_{\rm Ly\alpha}(z) + \tau_{\rm Ly\alpha}(z_\beta)$    (5)

where $\frac{f_\beta \lambda_\beta}{f_\alpha \lambda_\alpha} = 0.16$ is the ratio of the product between the oscillator strength and the
resonant scattering wavelength for Lyβ and Lyα. In Figures (15,16) the red symbols
(triangles: FRES, crosses: HRES & squares: LRES) show the computed GWD distributions.
There is a general agreement with the observations in matching the redshift evolution of
the GWD. However, our calculations generally predict higher frequency values in
the GWW range 0.25 - 0.75 (in log units) at z=3.5-5.5. The LRES data are below the
corresponding full/high resolution results for GWW values < 0.75 and therefore tend to
be in closer agreement with the observed points. The difference between the high and low
resolution can be explained as follows. The sharp wings of two high transmission regions
bounding a single gap in the HRES/FRES cases are smoothed out in the LRES case,
and this results in a smaller GWW value for the gap. Small size gaps are more sensitive
to the convolution process because the spread of the gap is more influenced by the wings
of the two bounding high transmission regions. Therefore, a reduction in the computed
GWW value for the small size gaps effectively causes the translation of the FRES/HRES
distribution leftward. However, due to the cutoff value of 1 Å in the GWW size some gaps
would be dropped in the LRES GWD calculation.
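
The Lyβ optical depth described in this section combines the direct Lyβ absorption with foreground Lyα absorption at a lower redshift. The sketch below illustrates that combination. The oscillator strengths and wavelengths are standard atomic values supplied here as assumptions (they are not quoted in the text), and `tau_alpha_of_z` is a hypothetical callable returning the Lyα optical depth at a given redshift.

```python
# Standard atomic data, assumed for this example (wavelengths in Angstrom).
F_ALPHA, LAMBDA_ALPHA = 0.4164, 1215.67
F_BETA, LAMBDA_BETA = 0.0791, 1025.72

# Ratio of (oscillator strength x wavelength) for Ly-beta relative to Ly-alpha.
RATIO = (F_BETA * LAMBDA_BETA) / (F_ALPHA * LAMBDA_ALPHA)


def z_foreground(z):
    """Redshift at which Ly-alpha absorbs at the observed wavelength of Ly-beta from z."""
    return (LAMBDA_BETA / LAMBDA_ALPHA) * (1.0 + z) - 1.0


def tau_lyb_total(tau_alpha_of_z, z):
    """Direct Ly-beta optical depth plus the foreground Ly-alpha optical depth."""
    return RATIO * tau_alpha_of_z(z) + tau_alpha_of_z(z_foreground(z))


print(round(RATIO, 2), round(z_foreground(6.0), 2))
```

The ratio evaluates to about 0.16, consistent with the value given in the text, and a gap is then counted only where both the Lyα and the combined Lyβ optical depths exceed the dark-gap threshold.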

All computed GWD curves have power-law segments but also exhibit a power-law
break below a redshift dependent GWW value. When compared to the observed results in
the bin z=4.5-5, which have a clear power law distribution, the power-law portion of the
computed GWD has a steeper slope. Our calculations essentially mean that we detect more
small gaps than observed and most likely fewer large gaps. We believe that
this is due to our limited box size. Due to our fixed integration redshift step (δz = 0.1),
each line of sight would cross the simulated volume a number of times before progressing
to the next redshift dump. Therefore, there is a non-zero probability that a number of
different lines of sight could register a resonant absorption feature from the same cosmic
neighborhood. Thus, an excess in the number of gaps of a particular size, when compared
to the observed value, may be an effect of oversampling. On the other hand, the size of our
simulated volume does not adequately sample the high end of the mass power spectrum. In
massive objects, like large galaxies, the large abundance of neutral hydrogen would register
as large dark gap features in a transmission spectrum. The absence of such features in the
computed GWD curve could be due to the lack of such objects in the volume, or to our
sample of lines of sight simply having missed them.

Despite the disagreement on the exact shape of the GWD profile, our calculations
reproduce in general the redshift evolution of the gap distribution inferred by observations.
The slow evolution toward larger gaps between z=3.5-5.0 is followed by a rapid change in
the GWD in observed and simulated data alike. Between z=5-5.5, both show an increase
in the total number of gaps. Specifically, the number of gaps with GWW values > 0.75 (in
log units) exhibits a sharp increase, which causes the gap distribution to flatten. Moreover, if
we take into account the observed upper limit error estimates then the observed power-law
profile featured in the previous redshift bin evolves to one similar to our calculation.
However, we are compelled to note that the decrease of the observed gap distribution in the
smallest GWW bin ($1 - 10^{0.25}$ Å) between z=5-5.5 does not occur in our synthetic spectra
until z > 5.5. That may be due to comparing results sampled and normalized
by radically different numbers of lines of sight (2 versus 75), or to the reionization epoch
inferred by the observed lines of sight occurring earlier than in our setup.

Figure (16) shows that between z=5.5-6 the larger gaps continue to grow in 
number while the smaller ones increasingly disappear. This "gap size reshuffling" trend 
continues in the last redshift interval, which also includes gaps with sizes > 100 Å. In 
addition, we plot in Figure (17) the total number of dark gaps per redshift path, which 
decreases at z > 6 after reaching a peak at z ~ 5.5. This evolutionary trend of the GWD at 
high redshifts is more pronounced in the LRES case. A narrow "high transmission" 
region separating two adjacent gaps in the full/high resolution cases may be convolved to a 
profile below the dark flux cutoff ($< \exp(-\tau_{\rm cutoff})$) in the LRES case. This would manifest 
as a merging of the individual gaps, which would disappear in the LRES sample, and 
the creation of a single larger gap. The LRES case simply illustrates more clearly 
how the "gap size reshuffling" occurs in all spectral resolution cases. As the redshift 
increases towards the reionization phase, not only the frequency but also the amplitude of 
high transmission regions decreases. Therefore, when the transmitted flux in a previously 
"high transmission region" drops below the dark flux cutoff, the adjacent gaps merge into a 
single larger one. This may explain why small gaps rapidly disappear while 
large gaps rapidly appear in the GWD between z=6-6.5. 
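
The merging mechanism described above can be illustrated with a minimal sketch. It assumes a dark gap is a contiguous run of pixels with flux below $\exp(-\tau_{\rm cutoff})$ and uses a boxcar average as a crude stand-in for the instrumental convolution; the function names, the toy spectrum and the boxcar kernel are illustrative assumptions, not the procedure used to produce the figures.

```python
import math

def find_dark_gaps(flux, tau_cutoff=2.5):
    """Return (start, stop) pixel index pairs of contiguous runs with flux below exp(-tau_cutoff)."""
    threshold = math.exp(-tau_cutoff)
    gaps, start = [], None
    for i, f in enumerate(flux):
        if f < threshold:
            if start is None:
                start = i                # a dark run begins at pixel i
        elif start is not None:
            gaps.append((start, i))      # the run covers pixels start .. i-1
            start = None
    if start is not None:                # run extends to the end of the spectrum
        gaps.append((start, len(flux)))
    return gaps

def boxcar_smooth(flux, width=3):
    """Centered moving average; a hypothetical stand-in for instrumental convolution."""
    half = width // 2
    smoothed = []
    for i in range(len(flux)):
        window = flux[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

# Toy spectrum: two dark runs separated by a single narrow transmission spike.
flux = [0.5] + [0.001] * 5 + [0.15] + [0.001] * 5 + [0.5]
full_res_gaps = find_dark_gaps(flux)                 # two separate gaps
low_res_gaps = find_dark_gaps(boxcar_smooth(flux))   # spike smoothed below the cutoff
print(full_res_gaps, low_res_gaps)
```

In this toy case the spike is diluted below the cutoff after smoothing, so two gaps of five pixels each are replaced by one gap of nine pixels: small gaps vanish from the distribution while a larger one appears.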

The rate of the gap size increase is shown in Figure (18) where we plot (left panel) 
the mean GWW against the mean redshift in each interval. We note the accelerated rate 
of increase at redshifts z > 5.5. It is evident that the slope would have been steeper 
if the results were plotted against cosmic time rather than redshift. The evolution 
progresses toward the reionization phase linearly with redshift up to z ~ 5.25, followed by 
a steep power-law type increase. To infer a power-law profile we used cubic interpolation 
between the data to acquire more points. Two least-squares fits are shown in Figure (18). 
A linear regression fit to the data at z < 5.25 yields the following parametric fit, 

$\langle GWW \rangle = a_0 z + a_1$, with slope $a_0 = 0.70\,(0.87) \pm 0.08\,(0.11)$ and intercept $a_1 = -1.04\,(-1.67) \pm 0.34\,(0.48)$. 
The numbers in parentheses refer to the low resolution data (dashed curves in the figure). 
For simplicity we treated the FRES and HRES cases as identical and used the full 
resolution data only. At z > 5.25 a power-law fit of the form $\langle GWW \rangle = (z/z_0)^{h}$ yields 

$h = 8.874\,(10.089) \pm 1.095\,(1.282)$ and $z_0 = 4.807^{+\ldots}_{-\ldots}\,(4.821^{+\ldots}_{-\ldots})$. We reported our fits in 
high precision because the profiles are very sensitive to the exact values in a linear-linear 
plot. As before, the parentheses refer to the LRES data. The standard deviation, 
$(\mathrm{Variance})^{1/2}$, normalized to the mean (right panel) exhibits a similar evolutionary trend. A 
linear evolution up to z ~ 5.25 is followed by a rapid increase under a strong power law 
($\sigma_{\langle GWW \rangle} \propto z^{\ldots}$) for larger redshifts. As a result, the $1\sigma$ deviation becomes comparable 
to the mean GWW at about z ~ 5.75, which reflects the broadening of the GWD in 
the two high redshift intervals (Figure (16), left panel). The extrapolation of our crude 
fits to z=6.9, which within $\delta z = 0.1$ corresponds to the beginning of reionization in our 
setup, yields an estimate of the statistical mean value for the gap wavelength width of 

$\langle GWW \rangle = 37.2 \pm 90.7$ Å (LRES fits). In comparison, the Lyα transmission at z=6.9 in 
a redshift interval of $\delta z = 0.1$ corresponds to $\Delta\lambda = 121.6$ Å, which is less than $1\sigma$ from the 
mean gap width. In effect, we can characterize the entire spectral region at that redshift as 
a "trough" and infer that the IGM is in effect neutral. 
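
The extrapolation quoted above can be checked with a few lines of arithmetic. The sketch below assumes the power-law fit has the form $\langle GWW \rangle = (z/z_0)^h$ with the LRES values h = 10.089 and $z_0$ = 4.821, and takes 1215.67 Å as the Lyα rest wavelength; the function names are illustrative.

```python
LYA_REST_WAVELENGTH = 1215.67  # Angstrom, rest wavelength of HI Lyα

def mean_gww_power_law(z, h, z0):
    """Power-law fit <GWW> = (z / z0)**h, in Angstrom (assumed functional form)."""
    return (z / z0) ** h

def lya_wavelength_span(delta_z):
    """Observed Lyα wavelength interval covered by a redshift interval delta_z."""
    return LYA_REST_WAVELENGTH * delta_z

gww_lres = mean_gww_power_law(6.9, h=10.089, z0=4.821)  # LRES fit values quoted in the text
span = lya_wavelength_span(0.1)
print(f"<GWW>(z=6.9) ~ {gww_lres:.1f} A, spectral span ~ {span:.1f} A")
```

Under these assumptions the extrapolated mean width is close to the quoted 37.2 Å and the spectral span is close to 121.6 Å, so the mean plus one standard deviation (90.7 Å) exceeds the full interval.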

Our calculation began by making an educated guess of the initial redshift and the 
profile of reionization, which allowed for the derivation of the IGM's global ionization 
properties and evolution. Even though synthesizing spectra at small RCP (Reionization 
Completion Parameter) values was impractical with our method (Zhang et al. 1997), the 
extrapolation of fits to the redshift evolution of the mean gap width allowed for a consistent 
determination, within the statistical errors, of when reionization began. Therefore, the 
evolution of the mean gap width can be used as an alternative method of reionization 
mapping. The clear advantage of gap statistics over using the mean transmitted flux at 
redshifts right after or during reionization is that the MTF is biased by high transmission 
regions, in real and redshift space alike, that become increasingly rare and narrow as the 
redshift increases. Any conclusions drawn in that case will be affected by small number 
statistics (too few lines of sight) and the large flux variance, as we discussed in previous 
sections of this work. On the other hand, as the dark gaps grow in size they include an 
increasingly larger portion of the local optical depth along a line of sight and therefore 
volume fraction of the IGM, which is significant if we are to make any claims about the 
IGM's global ionization state. Gaps also have the advantage that they are insensitive to the 
exact flux values of the high transmission regions that bound them. In addition, since the 
measurement of transmission values below the cutoff is more probable as we approach the 
reionization phase, we expect that the line of sight variance will not be as important a factor 
as it was in examining the MTF properties. Nonetheless, a disadvantage of the gap size 
analysis we have performed so far is that we cannot directly infer quantitative properties of 
the underlying baryon distribution inside the gaps, since all opacity information is reduced 
down to a single optical depth value, the optical depth threshold. For the remainder of this 
section, we investigate a possible correlation between the size of a gap and some measure of 
the gap's optical depth. The motivation is that if such an association exists, it is a measure of 
the coupling between dark gap statistics (size) and transmitted flux statistics (amplitude). 

One choice is the arithmetic "mean pixel optical depth" within each gap region. In 
Figures (19,20), we scatter plot the mean Lyα against the mean Lyβ optical depth for 
each gap with wavelength width larger than 1 Å and mean pixel optical depth 
greater than 2.5. The slope of the straight line is equal to the Lyβ to Lyα optical depth ratio, 
$\tau_{Ly\beta}/\tau_{Ly\alpha} = 0.16$. The 
color bar on Figure (19) shows the allocation of color for each pair of optical depths based 
on their measured gap width. The points were sequentially plotted from the smallest to 
the largest GWW value and therefore each color represents the largest gap value measured 
in the locale of a mean optical depth pair ($\langle\tau_{Ly\beta}\rangle$, $\langle\tau_{Ly\alpha}\rangle$). The solid colored regions 
on the graph represent the "exclusion zones" where both the Lyα and Lyβ optical depths 
are smaller than 2.5. Since every pixel across a contiguous gap has optical depth above the 
cutoff threshold ($\tau > \tau_{\rm cutoff} = 2.5$), the average pixel optical depth in a gap is also above the 
same cutoff value. This measure of a gap's optical depth would be heavily biased by the 
presence of large pixel optical depth values and is equivalent to associating the size of a gap 
with the opacity of the most underionized region(s) (largest overdensities) it contains. 

The scatter of the data in Figures (19,20) is due to the second term, $\tau_\alpha(z_\beta)$, in 
Equation (5), which we used to compute the Lyβ optical depth. The data points are 
scattered upwards, along the y-axis and off the constant slope line, because of the 
contribution to $\tau_\beta(z)$ from Lyβ photons redshifted to the Lyα frequency at a later redshift 
$z_\beta$. For $\tau_\alpha(z)$ values between 15.625 and 2.5 the second term needs to contribute at least 
between 0 and 2.1 optical depth units, respectively, to $\tau_\beta(z)$ for the pixel to be part of a "dark 
gap". For $\tau_\alpha(z) > 15.625$ all associated Lyβ pixel optical depths are above the optical 
depth cutoff. Since the number of pixels with $\tau_\alpha(z)$ between 2.5 and 15.625 increases as the redshift 
decreases, the identification of "dark gaps" fitting all the constraint parameters becomes 
increasingly more dependent on the value of the $\tau_\alpha(z_\beta)$ term. On the other hand, as 
the redshift decreases the average IGM opacity at $z_\beta < z$ is smaller ($x_{HI}(z_\beta) < x_{HI}(z)$) 
while the baryon mass clumping is larger ($C_\rho(z_\beta) > C_\rho(z)$). Consequently, it becomes 
increasingly rare for the redshifted Lyβ photons to be resonantly scattered from a high 
opacity region. On average, the contribution of the $\tau_\alpha(z_\beta)$ term decreases with decreasing redshift 
and that results in an increasing number of "off-slope dark gap" candidates being flagged 
out. The above argument explains why the scatter of the averaged optical depth gap pairs 
decreases with decreasing redshift. 
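
The thresholds quoted in this argument follow from simple arithmetic. The sketch below assumes that the Lyβ pixel optical depth has the two-term form implied by the text, $\tau_\beta(z) = 0.16\,\tau_\alpha(z) + \tau_\alpha(z_\beta)$, with a cutoff of 2.5, and that $1 + z_\beta = (1+z)\,\lambda_\beta/\lambda_\alpha$ with rest wavelengths of 1025.72 Å and 1215.67 Å; the function names are illustrative.

```python
LYB_TO_LYA_RATIO = 0.16     # slope of the constant slope line
TAU_CUTOFF = 2.5            # dark gap optical depth threshold
LYA_REST, LYB_REST = 1215.67, 1025.72  # Angstrom

def required_foreground_tau(tau_alpha):
    """Minimum tau_alpha(z_beta) needed for the Lyβ pixel to exceed the cutoff."""
    return max(0.0, TAU_CUTOFF - LYB_TO_LYA_RATIO * tau_alpha)

def foreground_redshift(z):
    """Redshift z_beta at which Lyα absorption overlaps Lyβ absorption from redshift z."""
    return (1.0 + z) * LYB_REST / LYA_REST - 1.0

for tau_alpha in (2.5, 8.0, 15.625, 30.0):
    print(tau_alpha, round(required_foreground_tau(tau_alpha), 3))
print(round(foreground_redshift(6.0), 3))
```

A pixel with $\tau_\alpha = 2.5$ needs 2.1 units from the foreground term, while any pixel with $\tau_\alpha \geq 15.625$ passes the Lyβ cutoff on its own.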

We can measure the scatter off the constant slope line by computing the median ratio 
between the mean Lyβ and Lyα gap optical depths at each redshift interval. It is difficult 
to gauge such a relation from Figures (19,20) since in each locale of mean optical depths 
we can only visualize the largest gap size. Any information on the scatter plot density is 
effectively masked out by the size of the plotting symbols. For comparison purposes, we 
also introduce an alternate method of measuring the optical depth properties of a 
gap: the gap "effective optical depth", which is computed as the negative natural 
logarithm of the transmitted flux averaged within the bounds of the dark region. Contrary 
to averaging pixel optical depths, which is sensitive to large values and therefore biased by 
the least ionized regions, this approach relates a gap to the opacity of the most ionized 
regions within the gap's bounds. 
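
The contrast between the two gap optical depth measures can be seen in a minimal sketch. The pixel values below are hypothetical and chosen only to show the opposite biases; the function names are illustrative.

```python
import math

def mean_pixel_tau(taus):
    """Arithmetic mean of the pixel optical depths in a gap."""
    return sum(taus) / len(taus)

def effective_tau(taus):
    """Negative natural log of the flux exp(-tau) averaged over the gap."""
    mean_flux = sum(math.exp(-t) for t in taus) / len(taus)
    return -math.log(mean_flux)

# Hypothetical gap: two pixels just above the cutoff and one very opaque pixel.
gap_taus = [3.0, 3.0, 100.0]
tau_mean = mean_pixel_tau(gap_taus)   # dominated by the opaque pixel
tau_eff = effective_tau(gap_taus)     # dominated by the most transparent pixels
print(round(tau_mean, 2), round(tau_eff, 2))
```

Here the mean pixel optical depth is pulled up to about 35 by the single opaque pixel, whereas the effective optical depth stays near 3.4, close to the most transparent pixels.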

The effective optical depth of a spectral region can be associated to a characteristic 
overdensity value where $\tau_{\rm eff} = \tau_{\Delta=1} \Delta_c^{\beta}$. In this equation, $\tau_{\Delta=1} = 14 g^{-1} \left(\frac{1+z}{7}\right)^{4.5}$ is the 
optical depth at the cosmic mean density and $\beta$ is a function of the adiabatic index $\gamma$ 
(Appendix I, II). On the left panel of Figure (21), we plot the redshift evolution of $\tau_{\Delta=1}$ 
between z = 4 - 6.45 for different choices of the normalized ionization rate, along with 
$\tau_{\rm eff}$ from Figure (6). The latter is consistently below the $\tau_{\Delta=1}$ curves, which indicates 
that it corresponds to overdensities $\Delta < 1$. If we overplot on the effective optical 
depth the evolution of $\tau_{\Delta=0.36}$ for three adiabatic index values, using the ionization rate 
inferred by the Equation (3) fit to our data, we see that the three curves bracket the 
redshift evolution of the effective optical depth. Therefore, the latter can be associated to 
a characteristic overdensity $\Delta_c \sim 0.36$. The exact value scales with redshift as shown on 
the right panel, where we compute $\Delta_c$ from $\tau_{\rm eff} = \tau_{\Delta=1} \Delta_c^{\beta}$ in the redshift interval z=4-6.4. 
Irrespective of the normalized ionization rate type or adiabatic index, which are listed on 
the figure, the characteristic overdensity $\Delta_c$ associated with the effective optical depth 
decreases with increasing redshift. For $\gamma = 4/3$ and the ionization rate inferred from the fit, 
$\Delta_c$ scales between 0.42 and 0.32 at z=4 and z=6.4 respectively. In conclusion, the effective 
optical depth is biased by the highly ionized low overdensity regions. 
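
The inversion from an optical depth to a characteristic overdensity can be sketched as follows. It assumes $\tau_{\Delta=1} = 14\,g^{-1}[(1+z)/7]^{4.5}$ and $\beta = 2 - 0.75(\gamma - 1)$ as quoted in the text and Appendix I; the value g = 1 used in the example is a placeholder, not the fitted ionization rate.

```python
def beta_from_gamma(gamma):
    """Power-law index of the optical depth vs. overdensity relation."""
    return 2.0 - 0.75 * (gamma - 1.0)

def tau_mean_density(z, g):
    """Optical depth at the cosmic mean density (Delta = 1)."""
    return 14.0 / g * ((1.0 + z) / 7.0) ** 4.5

def characteristic_overdensity(tau, z, g, gamma):
    """Invert tau = tau_mean_density * Delta_c**beta for Delta_c."""
    return (tau / tau_mean_density(z, g)) ** (1.0 / beta_from_gamma(gamma))

# Round trip with a placeholder ionization rate g = 1 at z = 6 and gamma = 4/3.
z, g, gamma = 6.0, 1.0, 4.0 / 3.0
tau_example = tau_mean_density(z, g) * 0.36 ** beta_from_gamma(gamma)
delta_c = characteristic_overdensity(tau_example, z, g, gamma)
print(round(beta_from_gamma(gamma), 2), round(delta_c, 2))
```

Any optical depth below $\tau_{\Delta=1}$ maps to $\Delta_c < 1$, which is the sense in which the effective optical depth samples underdense gas.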

The previous statement is based on properties of the entire spectrum regardless of an 
optical depth cutoff. Nonetheless, we can apply the method to the transmission fraction 
sampled by the dark gaps. Since that fraction increases as we approach reionization, it is 
expected that the characteristic overdensity associated with the gap effective optical depth 
at high redshifts will closely match the one inferred from all transmitted flux pixels. We first 
show in Figure (22) the redshift evolution of the scatter between the mean optical depth 
and the effective optical depth for each dark gap in the redshift intervals we considered. 
The two types are not correlated for mean optical depths larger than $\bar\tau_{\rm gap} \sim 15$ at z < 6 but 
are positively correlated at z > 6. They are also positively correlated for $\bar\tau_{\rm gap} < 15$ at all 
redshifts. We then proceed to convert the scatter plots in Figure (22) to the scatter plots 
in Figure (23) between the characteristic overdensities $\Delta_c$ estimated by $\Delta_c = (\tau/\tau_{\Delta=1})^{1/\beta}$ for 
$\gamma = 4/3$ and $\tau = \bar\tau_{\rm gap}$ (mean optical depth) or $\tau = \tau^{\rm eff}_{\rm gap}$. The figure shows that the effective 
optical depth is biased by smaller overdensities than the mean optical depth. In the redshift 
range approaching reionization the characteristic overdensity inferred by the effective optical 
depth of the dark gaps is entirely in the underdense ($\Delta < 1$) range. An overdensity average 
in the last redshift interval (6-6.5) yields 0.42 ± 0.08, which is comparable to ≈ 0.34, the 
value derived from Figure (21) at $z_{\rm bin} = 6.25$ for the same $\gamma$ and choice of the normalized 
ionization rate. For comparison, the overdensity average inferred from the gap mean optical 
depth in the same redshift range is 1.28 ± 0.78, which statistically samples the cosmic mean 
density. 

It is evident from the preceding analysis that the two gap optical depth definitions 
offer two complementary perspectives on the properties of the underlying baryon distribution 
sampled within the dark regions and therefore together they yield a more complete 
description of the gap opacity evolution. We can now plot on the left panel of Figure (24) 
the evolution of the median ratio between the mean Lyβ and mean Lyα optical depths 
from Figures (19,20) (blue histogram). In addition, we also plot (red histogram) the 
median ratio if we instead use the effective optical depth of a gap. As expected, the 
redshift evolution of both histograms shows a progressive reduction of the median ratio at 
smaller redshifts, which illustrates the decrease of the data scatter off the constant slope 
line observed in Figures (19,20). The asymptotic value of the ratio in the mean optical 
depth case approaches 0.16 by z=3.5, as predicted by Equation (5). Both computed ratios 
show that the Lyβ optical depth is an important selection bias in the flagging process of a 
low transmission region as a "dark gap". According to Figure (24), at redshifts following 
reionization completion, the mean (effective) gap Lyβ optical depth can be as much as 
~ 40% (~ 80%) that of the mean (effective) Lyα optical depth. A detailed examination 
of the redshift profiles in each case, which primarily depend on the spectral slope and 
evolution of the UVB (Songaila & Cowie 2002), is beyond the scope of this work. We can, 
however, estimate their relative scale averaged across the redshift range. The evolution of 
the gap mean optical depth yields an average median offset $\bar\tau_{Ly\beta}/\bar\tau_{Ly\alpha} - 0.16 \simeq 0.14$ between 
z=3.5-6.5. In addition, we can postulate that during the post-reionization smooth evolution 
of the IGM opacity, the offset in the effective optical depth ratio can be approximated by 
$\tau^{\beta}_{\rm eff}/\tau^{\alpha}_{\rm eff} - 0.16 \simeq \tau^{\alpha}_{\rm eff}(z_\beta)/\tau^{\alpha}_{\rm eff}(z) = [(1+z_\beta)/(1+z)]^{q} = (\lambda_\beta/\lambda_\alpha)^{q}$. Substituting for the 
power law exponent q = 4.16 from Equation (2) we can then estimate that $\tau^{\beta}_{\rm eff}/\tau^{\alpha}_{\rm eff} \simeq 0.66$ 
and an average scale between the two median ratios of 0.66/0.3 ~ 2.2, which is consistent 
with the median histograms in Figure (24). 
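
The estimate of the effective optical depth ratio can be reproduced numerically. The sketch below assumes that the foreground term scales as $(\lambda_\beta/\lambda_\alpha)^q$ with q = 4.16, rest wavelengths of 1025.72 Å and 1215.67 Å, and a mean optical depth ratio of 0.16 + 0.14 = 0.30; the variable names are illustrative.

```python
LYA_REST, LYB_REST = 1215.67, 1025.72  # Angstrom
Q_EXPONENT = 4.16                      # power-law slope of the effective optical depth
LINE_RATIO = 0.16                      # Lyβ to Lyα optical depth ratio at fixed redshift

# Foreground Lyα contribution at z_beta relative to the Lyα optical depth at z.
foreground_term = (LYB_REST / LYA_REST) ** Q_EXPONENT
effective_ratio = LINE_RATIO + foreground_term

mean_ratio = LINE_RATIO + 0.14         # average median ratio for the mean optical depth
relative_scale = effective_ratio / mean_ratio
print(round(foreground_term, 2), round(effective_ratio, 2), round(relative_scale, 1))
```

With these inputs the foreground term is close to 0.5, giving a ratio near the quoted 0.66 and a relative scale near 2.2.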

We have shown in Figure (22) that in general $\tau^{\rm eff}_{\rm gap} < \bar\tau_{\rm gap}$ at all redshifts. This is also 
illustrated on the left panel of Figure (24), where the median ratio of Lyβ to Lyα optical 
depths is larger in the case of $\tau^{\rm eff}_{\rm gap}$. In scatter plot layouts similar to Figures (19,20), points 
corresponding to pairs of $\tau^{\rm eff}_{\rm gap}$ values will be distributed in regions further to the left of the 
constant slope line when compared to $\bar\tau_{\rm gap}$ pairs. However, as we shall shortly show, the 
Lyα-Lyβ distributions for $\tau^{\rm eff}_{\rm gap}$ lack an important attribute that the $\bar\tau_{\rm gap}$ distributions have. 
In Figure (20), which shows the scatter plots for the last two redshift intervals, gaps with sizes larger 
than 100 Å (red color symbols), measured after z = 6 in our setup, correspond to mean 
Lyα optical depths larger than $\bar\tau_{Ly\alpha} \sim 100$. This suggests a correlation between the gap 
size and the gap mean optical depth, which is in fact illustrated by the way the data are 
plotted. The sequential color plotting from smaller gap sizes to larger ones reveals that 
there is a trend in the color scheme (size) to move to the right along the x-axis (Lyα mean 
optical depths) and upwards along the y-axis (Lyβ mean optical depths), in other words 
toward larger mean optical depth values. In order to measure this trend, we plot on the 
right panel of Figure (24) the redshift evolution of the "Pearson Correlation Coefficient" 
between the distributions of the gap Lyα mean optical depth ($\bar\tau_{\rm gap}$) and wavelength width 
(GWW). We note a weak but positive correlation persistent throughout the redshift range, 
with values between r ~ 0.4 - 0.55 (blue histogram). The same calculation between $\tau^{\rm eff}_{\rm gap}$ 
and GWW yields a positive but much weaker correlation (r ~ 0.16) in the last redshift 
interval only. For z < 6 the gap effective optical depth and size are uncorrelated. This 
result is not a surprise because in Figure (22) we showed that there is little correlation 
between $\tau^{\rm eff}_{\rm gap}$ and $\bar\tau_{\rm gap}$ at z < 6. Finding a positive correlation between $\bar\tau_{\rm gap}$ and gap size 
precludes a positive correlation between $\tau^{\rm eff}_{\rm gap}$ and gap size. The two optical depth types are 
largely uncorrelated because the mean optical depth is sensitive to a more extended range 
of baryon overdensities. As the dynamical range of overdensities becomes smaller at large 
redshifts the correlation improves. 

Intuitively, the result on the right panel of Figure (24) can be understood in terms of 
the opacity averaging procedure in each case. The mean optical depth method gives equal 
statistical weight to each local optical depth value along the gap. Therefore, the bigger the 
size of a gap the more extended the dynamical range of opacities sampled, and consequently 
the direct mean, being a "point-estimator" of all opacity data within the gap, will reflect 
that. Hence, the two quantities will be to some degree correlated irrespective of redshift. 
On the other hand, the effective optical depth method gives equal statistical weight to the local 
flux. The optical depth estimate in this case is biased by a fraction of the gap, the regions of 
lowest opacity associated with the low end of the overdensity range sampled along the gap. 
Hence, the total size and the effective optical depth of a gap are in principle uncorrelated. 
However, the approach of reionization in the last redshift interval shifts the overdensity bias 
in this method entirely into the underdense range, shown in Figure (23), which dominates the 
volume filling factor. Thus the segment of the gap to which the effective optical depth is 
sensitive rapidly increases to a size comparable to the gap width, hence the sharp change 
from the uncorrelated profile at z < 6 to a weak but positive correlation at z = 6 - 6.5. A 
similar jump of the same magnitude is also evident in the blue histogram; however, it is a 
more dramatic effect to traverse from completely uncorrelated data to some correlation at 
larger redshifts. 
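
For reference, the Pearson correlation coefficient used in this comparison can be computed as in the following minimal sketch. The gap widths and optical depths are made-up numbers for illustration only and are not taken from the simulation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical gap widths (Angstrom) and gap mean optical depths.
gap_width = [1.0, 2.0, 3.0, 4.0, 5.0]
gap_mean_tau = [2.0, 1.0, 4.0, 3.0, 5.0]
r_example = pearson_r(gap_width, gap_mean_tau)
print(round(r_example, 2))
```

In practice the coefficient would be evaluated separately in each redshift bin, once with the mean and once with the effective gap optical depth.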

The exact values of the correlation coefficients in each case will depend on the 
redshift bin size. The bin size of $\Delta z = 0.5$, which was chosen for constructing the gap 
width distribution, is most likely too large to showcase the transition to high correlation 
coefficients between size and optical depth as we cross into the reionization phase. Our 
values in the interval z=6-6.5 from the histograms on the right panel of Figure (24) are 
probably smoothed-out averages between large values at z ~ 6.5 and small values at z ~ 6. 

7. Conclusions and Summary 

In this work we investigate properties of the Lyα transmission at redshifts during and 
after reionization. We have performed a numerical simulation of the Lyα forest where 
ionization of the primary species is due to a homogeneous UVB (Fig. 1, 2), which we switch 
on at a redshift that is consistent with numerical and observational results that place the 
era of reionization at z > 6.5 (Fig. 3). We generated noiseless synthetic spectra along a 
fixed number of lines of sight through the simulated volume and measured the Lyα flux 
transmission between z=6.6-2.5 (samples shown in Fig. 4). In addition to the full pixel 
resolution spectra we obtain two more by convolving our spectra down to the HIRES Keck I 
and the ESI resolution values. 

Because of limits placed by our finite box size, we made no effort to explicitly match 
high redshift data, but rather study a realization of the Lyα forest consistent with the results 
of Songaila (2004), which assert a smooth profile for the effective optical depth beyond the 
Becker trough. The specific input parameters for our calculation were chosen to match low 
and intermediate redshift data derived in previous simulations (Jena et al. 2004). Our 
reionization profile is for all practical purposes artificial. It serves the purpose of measuring 
the effect a reionization phase has on observable quantities by its mere presence in the high 
redshift universe. We use the Reionization Completion Parameter (RCP) as a dimensionless 
variable to quantify the degree of reionization based on deviations from a smooth profile 
of the mean HI ionization fraction. In doing so we introduce a visualization technique to 
probe the fraction of the reionization profile that our observable quantities sample (Fig. 8). 

By placing the beginning of reionization at z=7, we achieve full reionization by z ~ 6.4 
and derive a smooth power law profile for the effective optical depth from low redshifts up 
to the tail of reionization (Fig. 5, 6). The smooth profile of our data between z=4-6.4 can also 
be fitted by the Songaila & Cowie (2002) analytic parametrization of the mean transmitted 
flux, which was based on expanding the uniform IGM optical depth formulation to a 
clumpy baryonic distribution (Fig. 7). The inferred slope for the normalized photoionization 
rate is shallower than the fit to the observed data; however, the inferred photoionization 
rate is consistent within errors with the one determined by our input UVB and computed 
temperature distribution (Fig. 7). 

As we enter the reionization tail, the mean transmitted flux (MTF) diverges 
from the profile inferred from earlier redshifts, which is the standard method of identifying 
the beginning of the reionization tail. However, we found that the large margins of error 
due to the inter-related effects of the cosmic and line of sight variances can undermine 
the degree of certainty behind claiming a reionization era measurement at low RCP 
values, especially when high resolution spectra are used (Fig. 8). A measurement within the 
reionization tail can be statistically excluded if the margins of error of the MTF include the 
extrapolated smooth profile from earlier redshifts. 

In Section (4.3) we studied the evolution of the flux variance and found that the 
variance to the MTF is inherently related to the mean line of sight variance (Fig. 10, 11). 
Each line of sight variance is a measure of the cosmic variance along that direction, which 
increases as the IGM becomes more neutral. However, the increased variance along a 
line of sight introduces an uncertainty in measuring the mean flux value that grows with 
redshift, and that error is carried over to the MTF computation, which is the average 
over all line of sight mean fluxes. In the end, the ability to determine a highly certain 
value for the MTF at high redshifts is inevitably degraded, as is indicated by the small 
kurtosis value of the LOS-mean flux distribution within the reionization tail (Fig. 11). 
The only way to statistically measure the MTF value with a relatively small margin of 
error is to use an extraordinary number of lines of sight, which is highly unlikely to be 
obtained observationally (Fig. 11). Therefore, we conclude that the uncertainty in the mean 
transmitted flux is unavoidable. The problem is compounded by the use of a mean flux 
analysis at high redshifts. 

The mean transmitted flux is a quantity that is biased by high transmission regions, 
and the contribution of low transmission or dark segments of the spectrum is diluted away 
by the exponentiation of the local optical depth. This is an important issue at high redshifts 
where such high transmission regions become rare. However, their presence along a line of 
sight will result in a mean flux calculation which will closely resemble the magnitude of the 
high transmission region (or few high transmission regions). This is the basis of the distinct 
probability of having to contend with a few large LOS-mean flux values within our sample 
of random lines of sight. Their subsequent contribution to the MTF will also skew the 
results toward a larger value when compared to the result obtained if such high transmission 
LOS were to be excluded (Fig. 12). Because that skewed value enters the calculation of the 
MTF variance, the deduced margins of error will also be large, and that finally will yield a 
picture that is statistically close to a smooth profile at a redshift when we are clearly within 
the reionization tail. 
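
The skewing effect can be illustrated with a toy sample. The numbers below are hypothetical: 75 lines of sight, of which 5 contain a high transmission region; they are not the LOS-mean fluxes measured in this work.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

# Hypothetical LOS-mean fluxes: 70 dark lines of sight and 5 with a transmission peak.
los_mean_flux = [0.002] * 70 + [0.05] * 5

mtf_all = mean(los_mean_flux)                                   # skewed by the 5 bright LOS
mtf_without_bright = mean([f for f in los_mean_flux if f < 0.01])
print(round(mtf_all, 4), round(mtf_without_bright, 4))
```

In this toy case fewer than 7% of the lines of sight raise the sample mean by a factor of 2.6 relative to the typical value.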

A much clearer association between the properties of the transmitted flux and 
reionization is derived when, instead of the MTF, we use the mode (most frequent value) of 
the LOS mean fluxes (Fig. 12). In doing so, we effectively exclude rare high transmission 
measurements at high redshifts. Unfortunately, what is more or less probable 
in the high redshift universe does not directly translate to the ability of observations to 
measure the transmitted flux. In fact, due to noise and the selection bias 
of sky objects, there is a higher probability of observing a high transmission line of sight 
than a low transmission one, even though the latter are more frequent but harder to measure. In the 
end, our conclusion is that observing a simultaneous steepening of the redshift profiles of 
the MTF and the flux variance (irrespective of definition, type and spectral resolution) 
relative to an extrapolated smooth evolution solidly points to the time when reionization 
terminates. However, the measurement of global properties of the IGM and the UVB 
during reionization depends on the profile of reionization itself, which is difficult to derive 
because of the intrinsically high degree of uncertainty of the data. Another way to look at 
our result is that each line of sight is effectively an attempt to 
measure its own unique reionization profile along its redshift path. The difficulty lies in 
ascertaining a single global reionization profile even if one is provided. We have come to 
this conclusion even when a uniform reionization UVB was used as an input. However, 
the assumed homogeneity of reionization through the global average of the IGM opacity 
is a stretch. Even if the UVB is uniform, the network of chemical reactions 
"operates" on an inhomogeneous baryonic distribution. The resulting opacity distribution 
is non-uniform and, depending on the scale at which we view it, inhomogeneous as well. 

In Section (5) we also examined the properties of the flux distribution at high redshifts 
(5.45-6.45) (both discrete and cumulative), which show that most of the transmitted 
flux (~ 80%) lies at values below ≈ $6 \times 10^{-3}$, or the upper limit of the Becker (2001) 
trough. In contrast, the mean transmitted flux at these redshifts lies in the top 20% of 
the flux distribution (Fig. 13, 14). This suggests that the mean flux at high redshifts reflects 
the opacity properties of a portion (volume fraction) of the IGM which becomes smaller 
as we approach reionization. Therefore, even though Becker-type troughs dominate at 
high redshifts, a mean transmitted flux analysis fails to register their contribution. Our 
conclusion is that the traditional method of measuring properties of the Lyα forest through 
the MTF is deficient and needs to be supplemented by an analysis that directly involves 
low transmission regions. 

In Section (6), we studied the distribution of "dark gap" sizes and its evolutionary 
properties, which become relevant at high redshifts because an increasingly larger fraction 
of the IGM is sampled within them. When compared to the observed data of dark gap sizes 
our results differ marginally in the exact shape and extent of the distribution (Fig. 15, 16). 
That may be due to the fact that the observed data we considered consisted of only two 
lines of sight, whereas in this work we took into consideration all of our 75 lines of sight. 
Nonetheless, the redshift evolution towards reionization, from small to large redshifts, 
is consistent with the one deduced from observations. The results show an accelerated 
"creation" rate of larger dark gaps at high redshifts (Fig. 15). The inclusion of two 
higher redshift intervals, for which we did not have any observed data to compare to, reinforced 
the previous conclusion (Fig. 16). We suggested a mechanism of gap merging as a possible 
explanation of the occurrence of larger gap sizes as we approach reionization and measured 
the redshift evolution of the average gap width (Fig. 18). Similarly to the mean transmitted 
flux analysis, the mean gap width and the error introduced by the spread of the gap 
distribution, which increases with redshift, steepen as the profile approaches reionization redshifts. 
Crude fits of the computed results to a power law functional form and extrapolation to 
higher redshifts yield a mean gap size that, with the error, is comparable to the size of the 
transmission spectrum. The redshift at which the above occurs coincides with the beginning 
of reionization in our setup. Therefore, we conclude that such an extrapolation of the mean 
gap width to high redshifts offers an appealing method for pinpointing the time at which 
reionization began. 

Finally, we found that there is a positive correlation between the gap size and the mean 
optical depth of the pixels it contains (Fig. 19, 20, 24). Such a correlation couples the size 
of dark regions to the underlying opacity distribution. Therefore, we conclude that by 
combining the analysis of the mean transmitted flux and dark gap sizes one can infer the 
properties of the entire IGM structure sampled by a line of sight through the cosmic volume. 

9. Appendix I 

Standard calculation of an analytic approximation of the Lyα optical depth begins 
with the HI photo-ionization equilibrium condition that relates the number density of HI to 
the photo-ionization rate ($\Gamma_{HI}$), the recombination coefficient $\alpha(T)$, a function of the gas 
temperature, and the baryon density of HII under the assumption of the electron abundance 
being determined by hydrogen ionization only: 

$$ n_{HI} = \frac{\alpha(T)}{\Gamma_{HI}}\, n_e\, n_{HII} \simeq \frac{\alpha(T)}{\Gamma_{HI}}\, (n_{HII})^2 \;\Rightarrow\; n_{HI} \propto \rho_{HII}^2 \qquad (6) $$

In a uniform IGM after reionization $\rho_{HII} \simeq \rho_H = Y_H \rho_b$, where $Y_H$ is the 
cosmic hydrogen abundance and $\rho_b = \bar\rho_B$ is the baryon cosmic mean. Integrating the 
optical depth along a line of sight from redshift z to the present yields the uniform Lyα 
optical depth expression $\tau_u = 14 g^{-1} \left(\frac{1+z}{7}\right)^{4.5}$, where g is the normalized photo-ionization rate 
given in Equation (3). A non-uniform IGM approximation can be obtained by assuming 
that $\rho_{HII} \simeq \rho_H$ and integrating the rms value $\langle \rho_H^2 \rangle$ of the 3D hydrogen distribution 
along a line of sight. Since $\langle \rho_H^2 \rangle = Y_H^2 \bar\rho_b^2 C_b$, where $C_b = \langle \delta^2 \rangle$ is the 
baryon clumping factor, the non-uniform IGM Lyα optical depth is then $\tau_{Ly\alpha} = \tau_u C_b$. 
The condition $C_b = 1$ recovers the uniform expression. An expansion can be obtained 
(Hui & Gnedin 1997, Croft 1998, McDonald & Miralda-Escude 2001) by assuming a 
power law dependence of the local optical depth on the local overdensity, $\tau_\Delta = \tau_u \Delta^{\beta}$, where 
$\beta = 2 - 0.75(\gamma - 1)$. The mean transmitted flux, Equation (2), can then be computed 
(Songaila & Cowie 2002) by integrating $F = \int P(\Delta) \exp(-\tau_\Delta)\, d\Delta$, where $P(\Delta)$ is the baryon 
distribution function (Miralda-Escude, Haehnelt & Rees 2000). 
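
The mean transmitted flux integral can also be evaluated numerically, as in the sketch below. It assumes the commonly quoted functional form of the Miralda-Escude, Haehnelt & Rees (2000) volume-weighted density distribution, $P(\Delta) \propto \exp[-(\Delta^{-2/3} - C_0)^2 / (2(2\delta_0/3)^2)]\,\Delta^{-b}$ with $\delta_0 = 7.61/(1+z)$ and b = 2.5, and sets $C_0 = 1$ as a placeholder; the form and parameter values should be checked against that paper before use. The normalization is handled numerically, and the function names are illustrative.

```python
import math

def baryon_pdf_unnormalized(delta, delta0, b=2.5, c0=1.0):
    """Assumed shape of the volume-weighted overdensity distribution P(Delta)."""
    width = 2.0 * delta0 / 3.0
    gauss = math.exp(-((delta ** (-2.0 / 3.0) - c0) ** 2) / (2.0 * width ** 2))
    return gauss * delta ** (-b)

def mean_transmitted_flux(tau_u, z, gamma, n=4000, lo=1e-3, hi=1e3):
    """Evaluate F = int P(Delta) exp(-tau_u Delta**beta) dDelta on a log-spaced grid."""
    beta = 2.0 - 0.75 * (gamma - 1.0)
    delta0 = 7.61 / (1.0 + z)
    step = math.log(hi / lo) / (n - 1)
    num = den = 0.0
    for i in range(n):
        delta = lo * math.exp(i * step)
        # dDelta = Delta * dlnDelta on the logarithmic grid
        weight = baryon_pdf_unnormalized(delta, delta0) * delta * step
        num += weight * math.exp(-tau_u * delta ** beta)
        den += weight
    return num / den  # dividing by den normalizes P(Delta)

for tau_u in (0.0, 1.0, 10.0):
    print(tau_u, mean_transmitted_flux(tau_u, z=6.0, gamma=4.0 / 3.0))
```

As expected, the flux equals unity for $\tau_u = 0$ and decreases monotonically as $\tau_u$ grows, with the surviving transmission coming from the underdense tail of the distribution.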

The basis for the above approximations is the assumption $\rho_{HII} \simeq \rho_H$. According 
to the top-left panel of Figure (8) this is a good global approximation even during the 
late stages of reionization (> 90%), where the volume averaged neutral fraction decreases 
from ~ $10^{\ldots}$ at z > 6.5. However, the volume averaged neutral fraction of hydrogen 
is biased toward low overdensities, which dominate the volume filling factor. At large 
overdensities the approximation will not be as valid, and therefore there will be a 
discrepancy between the distribution of HII and that of the underlying baryons. In the 
context of homogeneous reionization that discrepancy is less significant when compared 
to inhomogeneous reionization with self-shielding, but its effect cannot be ignored at 
times close to reionization. In our simulation the ratio of HII to baryon clumping factors 
scales between 0.5 at z = 6.5 and 0.95 by z = 6.2. Because the previous optical depth 
functional forms do not account for this "chemical" effect, we do not allow the approximation 
$\langle \rho_{HII}^2 \rangle \simeq \langle \rho_H^2 \rangle$ but still use $\langle \rho_{HII} \rangle \simeq \langle \rho_H \rangle$. 



where the ratio $\left(\frac{\langle \rho_{HII} \rangle}{\langle \rho_B \rangle}\right)^2$ can be approximated by $Y_H^2$. Therefore, the previous equation can
be written in the form $\langle n_{HI} \rangle = \frac{\alpha(T)}{\Gamma_{HI}\, m_p^2}\, Y_H^2\, C_B\, \bar{\rho}_B^2\, C_{HII}/C_B \;\Rightarrow\; \tau = \tau_u C_B \frac{C_{HII}}{C_B}$. The argument
we make is that if the FGPA equation was the basis for expanding $\tau = \tau_u C_B \rightarrow \tau_\Delta = \tau_u \Delta^\beta$,
then to account for the difference in the clumping factor between the baryon and HII
distributions at redshifts close to reionization we assume that Equation (2) applies under
the expansion $\tau = \tau_u C_B \frac{C_{HII}}{C_B} \rightarrow \tau_\Delta = \tau_u' \Delta^\beta$, where $\tau_u' = 14\, g^{-1} \frac{C_{HII}}{C_B} \left(\frac{1+z}{7}\right)^{4.5}$. The ratio of the
clumping factors is then effectively absorbed in the normalized ionization rate.



10. Appendix II 

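The mean transmitted flux, Equation (2), was computed in Songaila & Cowie (2002)
from the integral

$$ \langle F \rangle = \int P(\Delta) \exp(-\tau_\Delta)\, d\Delta \approx \int A \Delta^{-b}\, d\Delta\, \exp\left(-\tau_u \Delta^\beta - \frac{\Delta^{-4/3}}{8\delta_0^2/9}\right) \qquad (8) $$

where $\tau_\Delta = \tau_u \Delta^\beta$, $\tau_u$ is the optical depth in a uniform IGM, $\beta = 2 - 0.75(\gamma - 1)$ and
$\delta_0 = 7.61(1 + z)^{-1}$. $P(\Delta)$ is the functional dependency of the volume density distribution
(Miralda-Escude et al. 2000) on overdensity $\Delta$, and $b \approx 2.5$. The integration via the method
of steepest descents (Songaila & Cowie 2002) yields a general functional form for the mean
transmitted flux:

$$ \langle F \rangle = A \left(\frac{4\pi}{3\beta + 4}\right)^{0.5} \delta_0\, \Delta_0^{5/3 - b} \times \exp\left[-\left(\frac{4}{3\beta} + 1\right) \frac{\Delta_0^{-4/3}}{8\delta_0^2/9}\right] \qquad (9) $$

where $\Delta_0 = \left(\frac{2 \tau_u \beta \delta_0^2}{3}\right)^{-1/(\beta + 4/3)}$ is the value of $\Delta$ at which the exponent in Equation (8) sharply
peaks. The function F(g,z) is then derived through the dependence of $\tau_u$ on the normalized
ionization rate g, $\tau_u = 14\, g^{-1} \left(\frac{1+z}{7}\right)^{4.5}$. The mean square value of the transmitted flux is
similarly expressed by integrating $\exp(-2 \times \tau_\Delta)$ instead of $\exp(-\tau_\Delta)$ over the volume
density distribution function. We can view the multiplication factor of 2 as a modifier
of a constant ($\tau_u$) in the integration and therefore we simply rewrite Equation (9) with
$\Delta_0' = \Delta_0 \left(\frac{1}{2}\right)^{1/(\beta + 4/3)}$ instead of $\Delta_0$.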

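$$ \langle F^2 \rangle = \int A \Delta^{-b}\, d\Delta\, \exp\left(-2\tau_u \Delta^\beta - \frac{\Delta^{-4/3}}{8\delta_0^2/9}\right) \approx A \left(\frac{4\pi}{3\beta + 4}\right)^{0.5} \delta_0\, \Delta_0'^{\,5/3 - b} \qquad (10) $$

$$ \times \exp\left[-\left(\frac{4}{3\beta} + 1\right) \frac{\Delta_0'^{-4/3}}{8\delta_0^2/9}\right] \qquad (11) $$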
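
The quality of the steepest-descents step can be checked numerically. The sketch below is illustrative only: it assumes a volume distribution of the form P(Delta) = A Delta^-b exp[-Delta^(-4/3) / (8 delta_0^2 / 9)] with A = 1 and b = 2.5, and the values z = 5, g = 0.5 and an isothermal gas are arbitrary choices, not simulation output. All function names are hypothetical.

```python
import math


def delta_peak(tau_u, beta, delta0):
    """Overdensity at which the exponent of the flux integrand peaks."""
    return (2.0 * tau_u * beta * delta0 ** 2 / 3.0) ** (-1.0 / (beta + 4.0 / 3.0))


def flux_saddle(tau_u, beta, delta0, amp=1.0, b=2.5):
    """Steepest-descents estimate of the mean transmitted flux."""
    d = delta_peak(tau_u, beta, delta0)
    pref = amp * math.sqrt(4.0 * math.pi / (3.0 * beta + 4.0)) * delta0 * d ** (5.0 / 3.0 - b)
    expo = -(4.0 / (3.0 * beta) + 1.0) * d ** (-4.0 / 3.0) / (8.0 * delta0 ** 2 / 9.0)
    return pref * math.exp(expo)


def flux_numeric(tau_u, beta, delta0, amp=1.0, b=2.5, lo=1e-3, hi=20.0, n=100000):
    """Midpoint-rule integration of the same integrand, for comparison."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        d = lo + (i + 0.5) * step
        expo = -tau_u * d ** beta - d ** (-4.0 / 3.0) / (8.0 * delta0 ** 2 / 9.0)
        total += amp * d ** (-b) * math.exp(expo)
    return total * step


# Illustrative inputs (assumptions, not values from the paper's simulation).
z, g, gamma = 5.0, 0.5, 1.0
tau_u = 14.0 / g * ((1.0 + z) / 7.0) ** 4.5
beta = 2.0 - 0.75 * (gamma - 1.0)
delta0 = 7.61 / (1.0 + z)
ratio = flux_numeric(tau_u, beta, delta0) / flux_saddle(tau_u, beta, delta0)
print("numeric / steepest-descents:", round(ratio, 3))
```

The mean square flux follows from the same two functions by passing `2 * tau_u`, which moves the peak to `delta_peak(tau_u, ...) * 0.5 ** (1 / (beta + 4/3))`.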
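Substituting the functional forms of $\langle F^2 \rangle$ and $\langle F \rangle^2$ in the variance definition,
$Var = \frac{\langle F^2 \rangle}{\langle F \rangle^2} - 1$, we get
$Var = -1 + \left[A \left(\frac{4\pi}{3\beta + 4}\right)^{0.5} \delta_0\right]^{-1} \left(\frac{1}{2}\right)^{(5/3 - b)/(\beta + 4/3)} \Delta_0^{b - 5/3} \exp\left[\left(\frac{4}{3\beta} + 1\right) \frac{\Delta_0^{-4/3}}{8\delta_0^2/9} \left(2 - 2^{1/(1 + 3\beta/4)}\right)\right]$.
If we set $q_0 = \left(\frac{1}{2}\right)^{(5/3 - b)/(\beta + 4/3)}$ and $q_1 = \frac{1}{2}\, 2^{1/(1 + 3\beta/4)} < 1$ then the previous
equation becomes:

$$ Var = -1 + q_0 \left[A \left(\frac{4\pi}{3\beta + 4}\right)^{0.5} \delta_0\, \Delta_0^{5/3 - b} \exp\left(-\left(\frac{4}{3\beta} + 1\right) \frac{\Delta_0^{-4/3}}{8\delta_0^2/9}\, 2(1 - q_1)\right)\right]^{-1} \qquad (12) $$

where the quantity $2(1 - q_1) = 0.85 - 0.77$ between an isothermal ($\gamma = 1$, $\beta = 2$) and an
adiabatic ($\gamma = 5/3$, $\beta = 1.5$) equation of state respectively. We can simplify Equation (12)
if we substitute in the expression for the mean transmitted flux to get:

$$ Var = -1 + q_0 \left[A \left(\frac{4\pi}{3\beta + 4}\right)^{0.5} \delta_0\, \Delta_0^{5/3 - b}\right]^{1 - 2q_1} F^{2(q_1 - 1)} = -1 + c_0 (1 + z)^{c_1} F^{c_2} \qquad (13) $$

where $c_0 > 0$ & $c_2 = 2(q_1 - 1) < 0$. The term $(1 + z)^{c_1}$ is derived from the power-law
dependence of $\delta_0 \propto (1 + z)^{-1}$ & $\Delta_0 \propto g^{0.3} \times \left(\frac{1+z}{7}\right)^{-0.75}$ (Songaila & Cowie 2002) in
addition to a power-law assumption for the normalized ionization rate, $g = b_0 (1 + z)^{b_1}$.
The exponent then becomes $c_1 = (1 - 2q_1)[(5/3 - b)(0.3 b_1 - 0.75) - 1]$. Substituting for
$b \approx 2.5$ & $b_1 = -0.91$, the power law exponent inferred from our data in the redshift range
z = 4 - 6.4, we get $c_1 = 0.021 - 0.033$ for $\gamma = 1 - 5/3$. As a result, the term $(1 + z)^{c_1} \approx 1$
and therefore the variance depends primarily on the value of the mean transmitted flux in
a cosmic environment.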
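
The size of the redshift exponent can be verified directly from the quoted numbers. This is a small sketch, assuming c_1 = (1 - 2 q_1)[(5/3 - b)(0.3 b_1 - 0.75) - 1] with b = 2.5 and b_1 = -0.91, and taking q_1 from the quoted values 2(1 - q_1) = 0.85 (isothermal) and 0.77 (adiabatic). The function name is illustrative.

```python
def c1_exponent(q1, b=2.5, b1=-0.91):
    """Exponent of the (1+z) term in the variance-flux relation."""
    return (1.0 - 2.0 * q1) * ((5.0 / 3.0 - b) * (0.3 * b1 - 0.75) - 1.0)


for label, two_one_minus_q1 in (("isothermal", 0.85), ("adiabatic", 0.77)):
    q1 = 1.0 - two_one_minus_q1 / 2.0
    c1 = c1_exponent(q1)
    # (1+z)^c1 evaluated at z = 6 shows how close the term stays to unity.
    print(label, "c1 =", round(c1, 3), "(1+z)^c1 at z=6:", round(7.0 ** c1, 3))
```

Both cases give an exponent of a few hundredths, so the redshift factor changes the variance by only a few percent across the range considered.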



11. Appendix III 

Assume a random sample (lines of sight) of N means $X_i$ with measured standard
(un-normalized) variance $\sigma_i^2$. The standard variance of the mean $\bar{X} = \frac{1}{N} \sum_i X_i$ is then equal
to

$$ Var(\bar{X}) = Var\left(\frac{1}{N} \sum_i X_i\right) = \frac{1}{N^2} \sum_i Var(X_i) = \frac{1}{N^2} \sum_i \sigma_i^2 = \frac{1}{N} \langle \sigma_i^2 \rangle \qquad (14) $$

where $\langle\,\rangle$ denotes the average over the variance sample. If the variances along all lines
of sight are equal to a single value (the cosmic flux variance, $\sigma_{cf}$, at redshift z) then
$\langle \sigma_i^2 \rangle = \sigma_{cf}^2$ and therefore $Var(\bar{X}) = \sigma_{cf}^2 / N$. In our case $Var(\bar{X}) = \sigma_M^2 / N$, which suggests
$\sigma_M = \sigma_{cf}$. In addition, we defined $\sigma_L^2 = \langle \sigma_i^2 \rangle$ and therefore $\sigma_L = \sigma_M$.
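
The variance identity for independent lines of sight, Var(mean) = (1/N^2) sum(sigma_i^2) = (1/N) <sigma_i^2>, can be written out as a short sketch. The function names are illustrative, and the lines of sight are assumed to be statistically independent.

```python
def mean_los_variance(los_variances):
    """Average of the per-line-of-sight variances, <sigma_i^2>."""
    return sum(los_variances) / len(los_variances)


def variance_of_mean(los_variances):
    """Variance of the sample mean for independent lines of sight."""
    n = len(los_variances)
    return sum(los_variances) / n ** 2


# With equal variances along every line of sight the result is sigma^2 / N.
equal = [4.0, 4.0, 4.0, 4.0]
print(variance_of_mean(equal), mean_los_variance(equal) / len(equal))
```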
Table 1. Regression constants in Log (Variance^) = A(0) + A(1) (1 + z) for
Figure (9) (first four rows) and Figure (10) (last three rows). RNMLV-VAR refers to the
mean LOS variance renormalized to the MTF: $(\sigma_L/\mathrm{MTF})^2$.

Redshift Range & Type                           A(0) ± σ_A(0)     A(1) ± σ_A(1)
[2.5-5.8]  TOTAL-MTF-VAR       3.8 × 10^-2      -1.627 ± 0.050    0.308 ± 0.010
[2.5-5.8]  FRES-MLV-VAR        2.8 × 10^-3      -1.459 ± 0.013    0.277 ± 0.003
[4.5-5.8]  LRES-MLV-VAR        1.8 10-^         -1.589 ± 0.026    0.286 ± 0.004
[2.5-4.5]  HRES-MLV-VAR        6.3 10-^         -1.430 ± 0.016    0.273 ± 0.004
[2.5-5.8]  FRES-RNMLV-VAR      2.4 × 10^-3      -1.452 ± 0.013    0.274 ± 0.002
[2.5-6.25] FRES-RNMLV-VAR      2.4 × 10^-2      -1.532 ± 0.032    0.291 ± 0.006
[2.5-6.25] TOTAL-MTF-VAR       5.4 × 10^-2      -1.678 ± 0.048    0.319 ± 0.009
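
As a usage sketch for the table, the right-hand side of the regression, A(0) + A(1)(1 + z), can be evaluated for any row. Only two rows are transcribed here, the dictionary keys are hypothetical labels, and the code deliberately returns the fitted left-hand side without converting it back to a variance, since that conversion depends on the logarithm convention.

```python
# (A0, A1) pairs copied from two rows of Table 1; keys are illustrative labels.
TABLE1 = {
    "TOTAL-MTF-VAR [2.5-5.8]": (-1.627, 0.308),
    "FRES-MLV-VAR [2.5-5.8]": (-1.459, 0.277),
}


def regression_lhs(z, a0, a1):
    """Evaluate the linear-log fit A(0) + A(1) * (1 + z) at redshift z."""
    return a0 + a1 * (1.0 + z)


for name, (a0, a1) in TABLE1.items():
    print(name, round(regression_lhs(5.0, a0, a1), 3))
```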
Fig. 1. Fits to the redshift evolution of the photo-ionization rate for the three primary
IGM species (HI, HeI & HeII) derived from the soft spectrum calculation by Haardt & Madau
(2001) (solid line) for our ΛCDM cosmology. The softness of the spectrum ($\Gamma_{HI}/\Gamma_{HeII} > 100$)
is due to the UV production in galaxies. The larger number of galaxies in the high
redshift Universe (compared with QSOs) makes them the principal ionizing sources of neutral
hydrogen. The dashed line shows the rates computed due only to quasar contributions. The
falling QSO number counts in the high redshift epoch (z > 4) result in a steep drop in the
ionization rate.
Fig. 2. As in Fig. 1, but showing the photo-heating rates instead of the photo-ionization
rate per baryon. The increased rates in the HM01(Q+G) case yield higher IGM
temperatures for z > 4 (compared to the QSO-only case), which in turn results in greater
thermal broadening of the Lyα lines. The increased thermal broadening enhances the effects
of line-blending and the creation of dark gaps in the transmitted flux.
[Figure 3 panels: Case C, Case D; horizontal axes: Redshift]

Fig. 3. The redshift evolution of the Gunn-Peterson optical depth is plotted for four cases:
Case A0 refers to the quasar-only HM96 UV flux spectrum (open triangles). Case A makes
use of the Haardt & Madau (2001) HM01 UV spectrum where $z_{on}$ = 6.5 (open squares). Cases B, C & D
are the same as Case A but with $z_{on}$ = 7.0, 8.0 & 9.0 respectively. The choice of $z_{on}$ does
not affect the evolution of the optical depth in the redshift range of interest (z < 6.5).
Fig. 4. Extracted segments from a synthetic spectrum along a single line of sight in bins
of Δz = 0.1. The title in each panel is the mean redshift value in the extracted range. We
have convolved our spectrum with a Gaussian at the spectral resolution of R=36,000 for the
top two panels and R=5,300 for the bottom two panels in order to emulate the observed
Keck I HIRES data and ESI data at these resolutions respectively. As the line of sight is
cast through the time-series of the simulation data-dumps it samples different regions of the
computation volume. Therefore each panel depicts the local Lyα absorption from a different
cosmic neighborhood.
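
The resolution degradation described in this caption can be sketched as a Gaussian smoothing whose FWHM equals lambda/R. This is an illustrative implementation and not the authors' pipeline: it assumes uniform pixels, treats `wavelength` as a single representative wavelength for the segment, and renormalizes the kernel at the array edges. All names are hypothetical.

```python
import math


def gaussian_kernel(sigma_pix):
    """Normalized Gaussian weights out to 4 sigma (sigma given in pixels)."""
    half = max(1, int(math.ceil(4.0 * sigma_pix)))
    weights = [math.exp(-0.5 * (k / sigma_pix) ** 2) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def degrade_resolution(flux, wavelength, pixel, resolution):
    """Smooth `flux` to spectral resolution R = wavelength / FWHM.

    `pixel` is the pixel width in the same unit as `wavelength`.
    """
    fwhm = wavelength / resolution
    sigma_pix = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0))) / pixel
    kernel = gaussian_kernel(sigma_pix)
    half = len(kernel) // 2
    smoothed = []
    for i in range(len(flux)):
        num = 0.0
        den = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(flux):  # drop kernel samples beyond the array edges
                num += w * flux[j]
                den += w
        smoothed.append(num / den)
    return smoothed


# Example: a single transmission spike smoothed to R = 5,300 near 8000 Angstrom.
spike = [0.0] * 50 + [1.0] + [0.0] * 50
low_res = degrade_resolution(spike, wavelength=8000.0, pixel=0.1, resolution=5300.0)
```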
Fig. 5. Mean Transmitted Flux of the Lyα forest as a function of redshift as predicted by
our simulated data. The solid line marks the averaged flux in 30 redshift bins of Δz = 0.153
(Δλ ≈ 186 Å) from all the 75 lines of sight. The small diamonds represent the individual
mean fluxes from each line of sight. Overplotted are the ESI samples for z > 4.5 and the
Keck I HIRES samples for z < 4.5 from Songaila (2004) (large diamonds).
Fig. 6. As in Fig. 5, where the mean transmitted Lyα flux (red crosses) is converted to
optical depth. The observed flux lower limit (upper error bars in the optical depth plot)
is set to 0.0015. The blue lines above and below the effective optical depth represent the
LOS-averaged extremes in each redshift interval (solid: FRES & HRES; dashed: LRES). The
solid red line is a power law fit to the effective optical depth. The Becker gap (the highest
redshift diamond) is within $2.5\sigma_{LRES}$ (green error bar) from the MTF at $z_{mean}$ = 6.06 ± 0.08.
The power law fit Te// = 2.ltoil{^y-^^^°-'^^ does not include the optical depth at the last 
redshift bin (zmean = 6.52 ± 0.08). 
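
The conversion from mean transmitted flux to effective optical depth used in this figure can be sketched as tau_eff = -ln(F), with the flux floored at the quoted lower limit of 0.0015 so that fully absorbed bins map to a finite upper bound. The function name is illustrative.

```python
import math

FLUX_FLOOR = 0.0015  # observed flux lower limit quoted in the caption


def effective_optical_depth(mean_flux, floor=FLUX_FLOOR):
    """Return tau_eff = -ln(F), flooring F so dark bins give a finite value."""
    return -math.log(max(mean_flux, floor))


# A completely dark bin is assigned the optical depth of the flux floor.
print(round(effective_optical_depth(0.0), 2))
```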







Fig. 7. On the left panel we plot the effective optical depth with respect to redshift,
focusing on the redshift interval 4-6.4. The crosses and the solid line are the effective optical
depth and the power law fit from Figure (6). The dashed line is the fit to the data using
Equation (3). The fit is good to within 4% error.

On the right panel, we plot the normalized ionization rate versus redshift in the same interval
as on the left panel. The solid line is inferred from Equation (3) under the assumption of g
having a power law dependence ($g = b_0 (1 + z)^{b_1}$). The small dashed lines were computed
from the 1σ error to the scaling factor in the power law ($b_0$). Direct application of simulation
data in Equation (4) yields the dashed-dot line. The long-dashed line adjusts the previous
result by the ratio of the HII clumping factor over the baryon clumping factor.
Fig. 8. On the top panels, we show the profile of reionization as it is traced by the mean
baryon fraction in neutral hydrogen (top-left) between z=5-7. The reionization profile
(top-right) is fitted to an analytical function and the redshifts of reionization percentage of
completion are color-coded. Reionization is 99% complete by z=6.43 (red line) compared
to 1% completion at z=6.68 ($\Delta z_{reion}$ = 0.25). The MTF & variance evolution in the same
redshift range are plotted on the bottom panels. The Log(MTF) ($\propto \tau_{eff}$) (bottom-left)
evolves under the same power-law redshift profile (solid straight line) up to z ~ 6.25. It is
then followed by a 1-dex decrease within δz = 0.5. Overplotted are the margins of error to the
MTF (bars: CL=68%, open: CL=90%). The bottom-right panel shows the evolution of the
total MTF-variance (TVAR: solid line) and the Mean LOS-variance (MLV: dashed=FRES,
dashed-dot=LRES). The straight lines are linear-log fits to the data using redshifts z < 5.8.
The MLV-variance breaks away from a linear-log profile at z ~ 6.25 as it enters the extreme
end of the reionization tail. A break in the TVAR curve at z > 5.8 is not statistically 
Fig. 9. Redshift evolution of the Lyα transmission variance for
the complete sample of our synthetic spectra. The black triangles show the total variance of
the mean transmitted flux (MTF). The dashed and dash-dot lines show the MLV-variance
data in each resolution case, which cross the distribution of individual LOS-variance values
shown as colored points (green/blue for FRES/HRES, red for LRES). The profiles evolve
with redshift under a linear-log scaling law which fits the data up to z ~ 5.8 (Table 1).
Between z = 5.8 - 6.25 the data vary within the standard deviation of the linear-log fit. At
z > 6.25 all variance types steeply rise as we cross into the reionization epoch.
Fig. 10. Top-left panel: Log($\sigma_M$/MTF) (triangles) and Log($\sigma_L$/MTF) (dashed line)
versus redshift.

Top-right panel : { mtf ^ /^x where aj^ is either o-j^iv (mean los- variance: triangles) or 
$(\sigma_L/\mathrm{MTF})^2$ (mean standard LOS variance normalized to the mean transmitted flux: squares).
Bottom-left panel: Skewness of the LOS-mean fluxes $F_j$ (j = 1, NLOS).
Bottom-right panel: Kurtosis of the LOS-mean fluxes $F_j$ (j = 1, NLOS).
Fig. 11. On the left panel we plot the margin of error relative to the MTF
as it scales with redshift. At the 90% confidence level the margin of error is ≈ 60% of the MTF
value at z = 5.5 (≈ 40% at CL=68%) and larger than 100% at z > 6.4. This shows that 75
lines of sight undersample the distribution of mean fluxes during reionization. If we assume
that the LOS-mean flux measurements are normally distributed at all redshifts then we can
estimate the number of lines of sight (NLOS) required to bring the margin of error within 10% of
the MTF. We would need 1200 lines of sight to determine the MTF within 10% (CL=90%)
at z ≈ 5. The required number of LOS drops to about 600 at the 68% confidence level at
z ≈ 5.
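
Under the stated normality assumption the margin of error scales as 1/sqrt(N), so the number of lines of sight needed for a target margin follows from a simple rescaling. The sketch below is illustrative; the function name is hypothetical, and the usage line takes the 60% margin quoted for z = 5.5 at CL=90% as its input, which is a different redshift from the one the caption's LOS counts refer to.

```python
import math


def required_los(n_current, margin_current, margin_target):
    """Lines of sight needed to shrink the margin of error to `margin_target`.

    Assumes normally distributed LOS-mean fluxes, so margin ~ 1 / sqrt(N).
    Margins are fractions of the MTF at a fixed confidence level.
    """
    scaled = n_current * (margin_current / margin_target) ** 2
    return math.ceil(scaled - 1e-9)  # small tolerance against float round-off


# 75 lines of sight with a 60% margin, targeting a 10% margin.
print(required_los(75, 0.60, 0.10))
```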
Fig. 12. Top-left panel: The solid line shows the redshift evolution of the MDF compared
to the redshift evolution of the MTF (dashed line). The cross points were computed from
lines of sight that have mean fluxes > 10~^ in the last redshift interval. 
Top-right panel: The solid lines are the redshift evolution of the flux variance (upper: FRES,
lower: LRES) of the lines of sight with LOS-mean values within the mode of the LOS-means.
The dashed lines are the redshift evolution of the MTF variance (upper: FRES,
lower: LRES).

Bottom-left panel: The solid line shows the redshift evolution of the ratio
$\log_{10}$(MDF/MTF). The cross points show the ratio of the mean flux computed
from high-z high-transmission lines of sight only to the MTF at each redshift.
Bottom-right panel: The redshift evolution of the ratio of $\sigma_{MDF}$ to $\sigma_{MTF}$ is shown in the two
resolution cases (solid: FRES, dashed: LRES).
Fig. 13. Left panel: Discrete flux distribution (DFD) in the redshift interval [5.95,6.45].
Overplotted are the mean flux of the Becker gap (solid line) and the extremes of the 
transmitted flux (shaded area). Right panel: The mean transmitted flux in each of the 
flux bins of the DFD curve is plotted against the mean flux in each bin. 
Fig. 14. Cumulative flux distribution (CFD) in four redshift bins (from bottom up)
(4.25-4.75), (4.95-5.45), (5.45-5.95) & (5.95-6.45). The line of sight fluxes in all z-bins were
convolved down to the LRES spectral resolution. The data points for the first three bins are
extracted from the observational distribution in Figure (8) of Songaila et al (2002). In each
transmitted flux bin (x-axis) we average the fraction (y-axis) from all LOS and compute
the standard deviation. The shaded area is included between the ±2σ curves. There is a
general agreement with the observed data in the low and medium transmission regions. Any
disagreement in the high transmission end of the x-axis has to take into consideration the
sensitivity of the CFD to the extrapolated continuum in the observed data.
Fig. 15. Frequency of contiguous gaps in our synthetic spectra for the four redshift intervals
examined by Songaila (2002). We plot all three spectral resolution cases we studied (FRES:
triangles, HRES: crosses, LRES: squares). Between z=3.5-5.5 there is no evident difference
between the HRES and FRES cases. When compared to the higher resolution results,
the LRES case systematically underestimates the small gap widths while it systematically
overestimates the large gap widths. In all cases, the GWD extends to larger gap-widths as
the redshift increases while it simultaneously flattens in the small gap-width range (GWW
< 3 Å).
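
A gap-width distribution of this kind starts from a list of contiguous dark regions in each spectrum. The following is a minimal sketch of such a gap finder and is not the authors' code: it assumes a gap is a contiguous run of pixels whose optical depth exceeds 2.5 (flux below exp(-2.5)), measures the width between the first and last dark pixel, and keeps gaps wider than 1 Å. Those thresholds and all names are assumptions for illustration.

```python
import math


def dark_gaps(wavelength, flux, tau_min=2.5, min_width=1.0):
    """Return (start, end) wavelengths of contiguous runs with tau > tau_min.

    The width is measured between the first and last dark pixel, so a
    single isolated dark pixel has zero width under this convention.
    """
    flux_max = math.exp(-tau_min)
    gaps = []
    start = None
    for i, f in enumerate(flux):
        dark = f < flux_max
        if dark and start is None:
            start = i
        if start is not None and (not dark or i == len(flux) - 1):
            end = i if dark else i - 1
            if wavelength[end] - wavelength[start] > min_width:
                gaps.append((wavelength[start], wavelength[end]))
            start = None
    return gaps


# Toy spectrum on 1 Angstrom pixels: one 2 Angstrom gap and two shorter runs.
wl = [float(i) for i in range(10)]
fl = [1, 0, 0, 0, 1, 1, 0, 1, 0, 0]
print(dark_gaps(wl, fl))
```

Histogramming the widths `end - start` over all lines of sight in a redshift interval would then give a gap-width distribution in the spirit of the one plotted.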
Fig. 16. Same as in Figure (15) for the two high redshift intervals at z = 5.5 - 6.5. The
GWD slowly evolves between z=5-6 to larger GWW values (> 30 Å) and a flatter profile
at GWW < 6 Å. The last point marks the progressive disappearance of small gaps, which
accelerates at z > 6. The decrease in the frequency of small gap sizes and the simultaneous
increase in the frequency of large gaps suggest a mechanism of gap merging that occurs as
the opacity of the IGM increases close to reionization.
Fig. 17. Evolution of the total number of gaps per redshift path (squares: LRES, crosses:
HRES, triangles: FRES). Prior to entering the last stages of reionization the number of gaps
peaks at z ~ 5.5 and then decreases due to "gap merging".
Fig. 18. On the left panel we plot the log of the average gap width against redshift for the
three resolutions examined (squares: LRES, crosses: HRES, triangles: FRES). The mean
gap width increases linearly with redshift up to z ~ 5.25 and then rapidly evolves under a
hard power-law. On the right panel we plot the log of the ratio of the 1σ error over the mean gap
width. Large ratios (> 1) indicate significant scattering about the mean, which is due to the
GWD extending to larger gap widths close to the reionization phase. The dashed curves are
fits to the data: a linear profile at z < 5.25 and a power-law profile at z > 5.75. Cubic
interpolation between the data assisted in deriving the power-law fit.
Fig. 19. Scatter plot between the mean Lyα and the mean Lyβ optical depths for each
gap measured with wavelength width spread larger than 1 Å and for optical depth pixels
greater than 2.5. The slope of the straight line is 0.16, equal to the ratio $\frac{f_\beta \lambda_\beta}{f_\alpha \lambda_\alpha}$. The color
table on the figure shows the allocation of color for each pair of optical depths based on their
measured gap width. The points were plotted from the smallest to the largest gap values
and therefore the colored symbols represent the largest gap value measured in the locale of
each optical depth pair ($\tau_{Ly\beta}$, $\tau_{Ly\alpha}$).
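
The slope of 0.16 quoted for the straight line follows from the oscillator strengths and rest wavelengths of the two Lyman transitions. The short check below uses standard atomic values (f = 0.4162 at 1215.67 Å for Lyα, f = 0.0791 at 1025.72 Å for Lyβ); these constants are supplied here for illustration and are not quoted in the paper.

```python
# Standard atomic data for hydrogen Lyman-alpha and Lyman-beta.
F_ALPHA, LAMBDA_ALPHA = 0.4162, 1215.67  # oscillator strength, Angstrom
F_BETA, LAMBDA_BETA = 0.0791, 1025.72


def lyb_to_lya_ratio():
    """Ratio of Ly-beta to Ly-alpha optical depth for the same absorbing gas."""
    return (F_BETA * LAMBDA_BETA) / (F_ALPHA * LAMBDA_ALPHA)


print(round(lyb_to_lya_ratio(), 2))
```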
Fig. 20. Same as in Figure (19) but for the redshift intervals (5.5-6.0) & (6.0-6.5). Gaps
with wavelength widths larger than $10^{1.75} \approx 56$ Å (orange points) and 100 Å (red
symbols) become evident in each of the two redshift intervals respectively. On the right
panel we can see that the > 100 Å gaps at z > 6 correspond to mean Lyα optical depths
larger than 100.
Fig. 21. On the left panel we plot the redshift evolution of the optical depth at the
cosmic mean density, $\tau_{\Delta=1} = 14\, g^{-1} \left(\frac{1+z}{7}\right)^{4.5}$. The three cases illustrated correspond to
different choices for the normalized ionization rate. From top to bottom: the rate from
raw simulation data, the rate modified by the ratio of baryon to HII clumping factors
($g\, C_B/C_{HII}$), and the rate inferred by the fit of our optical depth data to the functional form of
Equation (3). Overplotted is the redshift evolution of the effective optical depth (crosses)
and $\tau_\Delta = \tau_{\Delta=1} \Delta^\beta$, for Δ = 0.36 and (from top to bottom) γ = 5/3, 4/3, 1. On the right
panel, we compute the overdensity where $\tau_{eff} = \tau_{\Delta=1} \Delta^\beta$ in the redshift interval z=4-6.45
for the normalized ionization rate types and the range of adiabatic indices mentioned. The figure
shows that at high redshifts (z > 4) the effective optical depth derivation is biased by small
overdensities (Δ < 0.5).
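
The overdensity at which the local relation reproduces a given effective optical depth is obtained by inverting tau = tau_(Delta=1) Delta^beta. The sketch below assumes tau_(Delta=1) = 14 g^-1 ((1+z)/7)^4.5 and beta = 2 - 0.75(gamma - 1); the function names and the sample inputs are illustrative and are not the simulation values.

```python
def tau_mean_density(z, g):
    """Optical depth at the cosmic mean density (Delta = 1)."""
    return 14.0 / g * ((1.0 + z) / 7.0) ** 4.5


def matching_overdensity(tau, z, g, gamma):
    """Overdensity Delta for which tau_(Delta=1) * Delta^beta equals tau."""
    beta = 2.0 - 0.75 * (gamma - 1.0)
    return (tau / tau_mean_density(z, g)) ** (1.0 / beta)


# Illustrative call: an effective optical depth of 2 at z = 5 with g = 0.5.
print(round(matching_overdensity(2.0, 5.0, 0.5, 4.0 / 3.0), 2))
```

A result below unity means the effective optical depth is set by underdense gas, which is the bias the figure illustrates.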
Fig. 22. Redshift evolution of the scatter plot between the Lyα mean optical depth and
the Lyα effective optical depth of dark gaps. In the range $\bar{\tau}_{gap} > 15$ the two types are not
correlated at z < 6 and weakly correlated at z > 6. For mean optical depths $\bar{\tau}_{gap} < 15$ there
is a positive correlation which becomes stronger as the gap optical depth decreases.
Fig. 23. Redshift evolution of the scatter plot between the characteristic overdensity
$\Delta_c$ and the inferred average optical depth of a dark gap. The plots were compiled from
Figure (22) through the equation $\Delta_c = \left(\frac{\tau}{\tau_{\Delta=1}}\right)^{0.57}$ for γ = 4/3 and $\tau = \langle \tau_{gap} \rangle$ (mean optical
depth) or $\tau = \tau_{eff}$. The figure shows that the gap effective optical depth is biased by smaller
overdensities than the gap mean optical depth. As we approach reionization (z > 6) $\Delta_c^{eff}$ is
entirely in the underdense (Δ < 1) range.
Fig. 24. Left Panel: Evolution of the median ratio between the Lyβ and Lyα optical
depths from Figures (19,20) (blue histogram). For comparison, we overplot (red histogram)
the median ratio derived using the gap effective optical depth. The ratio reflects an unbiased
statistical relationship of the Lyβ to Lyα optical depth scatter inferred from Equation (5).
As the redshift decreases both ratios decrease toward the expected asymptotic value (0.16),
which in the case of the mean optical depth is reached by z ~ 3.5.

Right Panel: Evolution of the Pearson Correlation Coefficient between the distributions of
the gap Lyα optical depth and the gap wavelength width (GWW). A "weak" (r ~ 0.5) but
positive correlation is inferred between the gap mean optical depth and the gap size across
the redshift range 3.5 < z < 6.5. In contrast, the gap effective optical depth and gap size
are uncorrelated at z < 6. A much weaker correlation, when compared to the mean optical
depth case, is seen in the redshift interval 6 < z < 6.5.
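
For reference, the Pearson correlation coefficient used in the right panel can be computed as below. This is a generic textbook implementation with an illustrative function name; the inputs would be the per-gap optical depths and wavelength widths.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Values near 0.5, as quoted for the mean optical depth, indicate a positive but loose linear trend between gap opacity and gap size.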



